Query-focused multidocument summarization based on hybrid relevance analysis and surface feature salience

Authors: Jen-Yuan Yeh, Wei-Pang Yang, Hao-Ren Ke

Keywords: Feature (machine learning); Sentence; Information retrieval; Information needs; Salience (neuroscience); Vector space model; Computer science; Automatic summarization; Relevance (information retrieval); Latent semantic analysis

Abstract: Query-focused multidocument summarization aims to synthesize, from a set of topic-related documents, a brief, well-organized, fluent summary for the purpose of answering an information need that cannot be met by just stating a name, date, quantity, etc. In this paper, the task is essentially treated as a sentence retrieval task. We propose a hybrid relevance analysis to evaluate the relevance of a sentence to the query. This is achieved by combining similarities computed from the vector space model and latent semantic analysis. Surface features are also examined to discern the impact of low-level features on query-focused summarization. In addition, a modified Maximal Marginal Relevance is proposed to reduce redundancy by taking into account shallow feature salience. The experimental results show that the proposed method obtained competitive results when evaluated with the DUC 2005 corpus.
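
Illustrative sketch (not the authors' implementation): the abstract names two ingredients, a hybrid relevance score that combines vector space model and latent semantic analysis similarities, and a Maximal Marginal Relevance variant that accounts for surface feature salience. It does not say how they are combined. The Python below is one plausible reading under stated assumptions: the two similarities are mixed by linear interpolation with a weight alpha, and salience enters the MMR score as an additive bonus weighted by beta. The function names hybrid_relevance and mmr_select, and the parameters k, alpha, lam and beta, are all hypothetical.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    return float(a @ b / (na * nb)) if na > 0 and nb > 0 else 0.0

def hybrid_relevance(term_sentence, query_vec, k=2, alpha=0.5):
    """Score every sentence against the query.

    term_sentence: (terms x sentences) weight matrix.
    query_vec:     (terms,) query vector in the same term space.
    k, alpha:      ASSUMED LSA rank and mixing weight (not from the source).
    """
    n_sent = term_sentence.shape[1]
    # Vector space model: cosine similarity in the original term space.
    vsm = np.array([cosine(term_sentence[:, j], query_vec) for j in range(n_sent)])
    # LSA: truncated SVD, then project sentences and query onto the top-k axes.
    U, s, Vt = np.linalg.svd(term_sentence, full_matrices=False)
    k = min(k, len(s))
    sent_latent = (np.diag(s[:k]) @ Vt[:k, :]).T  # one row per sentence
    query_latent = U[:, :k].T @ query_vec
    lsa = np.array([cosine(row, query_latent) for row in sent_latent])
    # ASSUMPTION: linear interpolation of the two similarity scores.
    return alpha * vsm + (1 - alpha) * lsa

def mmr_select(relevance, salience, sent_vecs, n=2, lam=0.5, beta=0.2):
    """Greedy MMR-style selection of n sentence indices.

    ASSUMPTION: salience is added to relevance with weight beta; the
    source only says salience is 'taken into account'.
    """
    selected = []
    candidates = list(range(len(relevance)))
    while candidates and len(selected) < n:
        def score(i):
            # Redundancy = highest similarity to any sentence already chosen.
            redundancy = max(
                (cosine(sent_vecs[i], sent_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * (relevance[i] + beta * salience[i]) - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

In use, hybrid_relevance would be called on a term-by-sentence matrix built from the document set, and its scores passed to mmr_select together with per-sentence salience values and the sentence vectors. Setting beta to 0 reduces the selection step to plain MMR.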
