New event detection based on indexing-tree and named entity

Authors: Kuo Zhang, Juan Zi Li, Gang Wu

DOI: 10.1145/1277741.1277780

Keywords: Machine learning, Term (time), Computer science, Search engine indexing, Tree (data structure), Class (biology), Event (computing), Task (project management), Named entity, Artificial intelligence, Linguistic Data Consortium, Data mining

Abstract: New Event Detection (NED) aims at detecting, from one or multiple streams of news stories, which story reports on a new event (i.e., one not reported previously). With the overwhelming volume of news available today, there is an increasing need for an NED system that can detect new events more efficiently and accurately. In this paper we propose a new NED model that speeds up the NED task by using a news indexing-tree dynamically. Moreover, based on the observation that terms of different types have different effects on the NED task, two term reweighting approaches are proposed to improve NED accuracy. In the first approach, we adjust term weights dynamically based on previous story clusters; in the second, we employ statistics on training data to learn a named entity reweighting model for each class of stories. Experimental results on two Linguistic Data Consortium (LDC) datasets, TDT2 and TDT3, show that the proposed model can improve both the efficiency and the accuracy of the NED task significantly, compared to the baseline system and other existing systems.

References (15)
R. Papka, J. Allan. On-Line New Event Detection using Single Pass Clustering. University of Massachusetts (1998).
James Allan. Topic Detection and Tracking: Event-based Information Organization. Kluwer Academic Publishers (2002).
James P. Callan, W. Bruce Croft, Stephen M. Harding. The INQUERY Retrieval System. Database and Expert Systems Applications, pp. 78-83 (1992). DOI: 10.1007/978-3-7091-7557-6_14
Juha Makkonen, Helena Ahonen-Myka, Marko Salmenkivi. Simple Semantics in Topic Detection and Tracking. Information Retrieval, vol. 7, pp. 347-368 (2004). DOI: 10.1023/B:INRT.0000011210.12953.86
Yiming Yang, Tom Pierce, Jaime Carbonell. A Study of Retrospective and On-line Event Detection. International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 28-36 (1998). DOI: 10.1145/290941.290953
Thorsten Brants, Francine Chen, Ayman Farahat. A System for New Event Detection. International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 330-337 (2003). DOI: 10.1145/860435.860495
Nicola Stokes, Joe Carthy. Combining Semantic and Syntactic Document Classifiers to Improve First Story Detection. International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 424-425 (2001). DOI: 10.1145/383952.384068
Giridhar Kumaran, James Allan. Using Names and Topics for New Event Detection. Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing (HLT '05), pp. 121-128 (2005). DOI: 10.3115/1220575.1220591
Robert E. Schapire, Yoram Singer. BoosTexter: A Boosting-based System for Text Categorization. Machine Learning, vol. 39, pp. 135-168 (2000). DOI: 10.1023/A:1007649029923
Yiming Yang, Jian Zhang, Jaime Carbonell, Chun Jin. Topic-conditioned Novelty Detection. Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '02), pp. 688-693 (2002). DOI: 10.1145/775047.775150