On-line event and topic detection by using the compact sets clustering algorithm

Authors: Aurora Pons-Porrata, Rafael Berlanga-Llavori, José Ruiz-Shulcloper

DOI:

Keywords: Measure (mathematics), Structure (mathematical logic), Similarity (network science), Computer science, Hierarchy (mathematics), Line (geometry), Canopy clustering algorithm, Cluster analysis, Data mining, Event (computing)

Abstract: In this paper we propose a new incremental clustering algorithm for Event Detection, which is based on the mathematical properties of compact sets. Additionally, the algorithm makes use of temporal references appearing in document texts to measure the similarity between documents according to the events that they describe. In order to discover the structure of topics and composite events, the algorithm is hierarchically applied to a stream of newspaper articles. Thus, in the first level, documents with a high temporal-semantic similarity are clustered together into events. In the next levels of the hierarchy, these events are successively clustered into more complex topics. The evaluation results demonstrate that regarding the temporal references of documents improves the quality of the system-generated clusters, and that the overall performance of the proposed system compares favorably to other on-line detection systems in the literature.
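The abstract does not spell out the algorithm, so the following is only a minimal batch sketch of the general idea, not the authors' incremental method. It assumes a common formulation in which compact sets are the connected components of the graph that links each document to its most similar neighbour(s), provided that similarity reaches a threshold. The names `temporal_semantic_similarity`, `compact_sets`, `beta`, and `max_gap_days`, as well as the rule of zeroing the similarity for documents dated far apart, are illustrative assumptions.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = sqrt(sum(w * w for w in a.values()))
    nb = sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def temporal_semantic_similarity(d1, d2, max_gap_days=3):
    """Assumed rule: documents dated too far apart are dissimilar;
    otherwise fall back to the cosine of their term vectors."""
    if abs(d1["day"] - d2["day"]) > max_gap_days:
        return 0.0
    return cosine(d1["terms"], d2["terms"])


def compact_sets(docs, beta=0.2, sim=temporal_semantic_similarity):
    """Group documents into connected components of the
    maximum-similarity graph (edges kept only if similarity >= beta)."""
    n = len(docs)
    parent = list(range(n))  # union-find forest over document indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        sims = [(sim(docs[i], docs[j]), j) for j in range(n) if j != i]
        if not sims:
            continue
        best = max(s for s, _ in sims)
        if best < beta:
            continue  # no sufficiently similar neighbour: stays alone
        for s, j in sims:
            if s == best:  # link to every most-similar neighbour
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy stream: two reports of one event, a later similar story, an unrelated one.
docs = [
    {"day": 1, "terms": {"quake": 1.0, "chile": 1.0}},
    {"day": 2, "terms": {"quake": 1.0, "chile": 1.0, "rescue": 1.0}},
    {"day": 30, "terms": {"quake": 1.0, "chile": 1.0}},
    {"day": 31, "terms": {"election": 1.0, "vote": 1.0}},
]
events = compact_sets(docs)
print(events)  # [[0, 1], [2], [3]]
```

In this toy run the third document shares its vocabulary with the first two but is dated four weeks later, so the assumed temporal gate keeps it in a separate event. A hierarchical variant could, in principle, re-apply `compact_sets` to one representative per event with a lower threshold to form topics; that step is not shown here.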

References (7)
D. Llidó, R. Berlanga, M. J. Aramburu. Extracting Temporal References to Assign Document Event-Time Periods. Database and Expert Systems Applications, pp. 62-71 (2001). 10.1007/3-540-44759-8_8
Yiming Yang, Tom Pierce, Jaime Carbonell. A study of retrospective and on-line event detection. International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 28-36 (1998). 10.1145/290941.290953
Russell Swan, James Allan. Automatic generation of overview timelines. International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 49-56 (2000). 10.1145/345508.345546
Bjornar Larsen, Chinatsu Aone. Fast and effective text mining using linear-time document clustering. Knowledge Discovery and Data Mining, pp. 16-22 (1999). 10.1145/312129.312186
Victor Lavrenko, James Allan, Vikas Khandelwal, David Frey. UMass at TDT 2000 (2000).
John E. Hopcroft, Jeffrey Ullman, Alfred V. Aho. Data Structures and Algorithms (1983).