An exploration of document impact on graph-based multi-document summarization

Author: Xiaojun Wan

DOI: 10.3115/1613715.1613811

Keywords: Computer science, Information retrieval, Multi-document summarization, Automatic summarization, Graph based, Graph (abstract data type), Ranking

Abstract: The graph-based ranking algorithm has recently been exploited for multi-document summarization by making use of only the sentence-to-sentence relationships in the documents, under the assumption that all the sentences are indistinguishable. However, given a document set to be summarized, different documents are usually not equally important, and moreover, different sentences in a specific document are usually differently important. This paper aims to explore document impact on summarization performance. We propose a document-based graph model to incorporate the document-level information and the sentence-to-document relationship into the graph-based ranking process. Various methods are employed to evaluate the two factors. Experimental results on the DUC2001 and DUC2002 datasets demonstrate the good effectiveness of the proposed model. Moreover, the results show the robustness of the proposed model.
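
The general idea in the abstract, letting document importance influence a graph-based sentence ranking, can be illustrated with a minimal sketch. This is not the paper's formulation: the function name `rank_sentences`, the choice to scale each edge by the weight of the target sentence's document, the damping value, and the toy similarity matrix are all assumptions made here for illustration only.

```python
def rank_sentences(sim, doc_of, doc_weight, damping=0.85, iters=200, tol=1e-10):
    """Rank sentences by a PageRank-style random walk on a sentence graph.

    sim        -- square matrix of sentence-to-sentence similarities
    doc_of     -- doc_of[i] is the index of the document containing sentence i
    doc_weight -- hypothetical importance weight for each document
    """
    n = len(sim)
    # Assumed edge weight i -> j: similarity scaled by the importance of
    # the document that contains the target sentence j. No self-loops.
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                w[i][j] = sim[i][j] * doc_weight[doc_of[j]]

    # Row-normalize so each row is a probability distribution.
    for i in range(n):
        total = sum(w[i])
        if total > 0:
            w[i] = [x / total for x in w[i]]
        else:
            # A sentence with no outgoing edges jumps uniformly.
            w[i] = [1.0 / n] * n

    # Power iteration with a uniform teleport term.
    scores = [1.0 / n] * n
    for _ in range(iters):
        new = [
            (1.0 - damping) / n
            + damping * sum(scores[i] * w[i][j] for i in range(n))
            for j in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(new, scores))
        scores = new
        if delta < tol:
            break
    return scores


# Toy data: four sentences, two per document, all pairs equally similar.
sim = [[0.0 if i == j else 0.5 for j in range(4)] for i in range(4)]
doc_of = [0, 0, 1, 1]

uniform = rank_sentences(sim, doc_of, doc_weight=[1.0, 1.0])
weighted = rank_sentences(sim, doc_of, doc_weight=[1.0, 3.0])
```

With equal document weights the sketch reduces to a plain similarity-graph ranking, so every sentence in the toy data gets the same score. Raising the weight of the second document shifts score toward its sentences, which is the kind of document-level effect the abstract describes.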
