Internet-scale information monitoring: a continual query approach

Authors: Wei Tang, Ling Liu

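Abstract: Information monitoring systems are publish-subscribe systems that continuously track information changes and notify users (or programs acting on behalf of humans) of relevant updates according to specified thresholds. Internet-scale information monitoring presents a number of new challenges. First, automated change detection is harder when sources are autonomous and updates are performed asynchronously. Second, information source heterogeneity makes the problem of modelling and representing changes harder than ever. Third, efficient and scalable mechanisms are needed to handle a large and growing number of users and thousands or even millions of triggers fired at multiple sources. In this dissertation, we model users' monitoring requests using continual queries (CQs) and present a suite of efficient and scalable solutions to large-scale information monitoring over structured or semistructured data sources. A CQ is a standing query that monitors information sources for interesting events (triggers) and notifies users when the updates meet specified thresholds. In this dissertation, we first present the system-level facilities for building an Internet-scale continual query system, including the design and development of two operational CQ monitoring systems, OpenCQ and WebCQ, the engineering issues involved, and our solutions. We then describe a number of research challenges that are specific to large-scale information monitoring and the techniques developed in the context of OpenCQ and WebCQ to address these challenges. Example issues include how to efficiently process a large number of continual queries, what mechanisms are effective for building a scalable distributed trigger system capable of handling tens of thousands of triggers firing at hundreds of data sources, and how to effectively disseminate fresh information to the right users at the right time. We have developed a suite of techniques to optimize the processing of continual queries, including an effective CQ grouping scheme, an auxiliary data structure to support group-based indexing of CQs, and a differential CQ evaluation algorithm (DRA). The third contribution is the design of an experimental evaluation model and testbed to validate the solutions. We have engaged both measurements on real systems (OpenCQ/WebCQ) and a simulation-based approach. To our knowledge, the research documented in this dissertation is to date the first one to present a focused study of the research and engineering issues in building large-scale information monitoring systems using continual queries.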
