Query-focused multi-document summarization: Automatic data annotations and supervised learning approaches

Authors: YLLIAS CHALI, SADID A. HASAN

DOI: 10.1017/S1351324911000167

Keywords: Semantic similarity, Artificial intelligence, Semi-supervised learning, Computer science, Machine learning, Support vector machine, Natural language processing, Automatic summarization, Supervised learning, Multi-document summarization, Conditional random field, Similarity measure

Abstract: In this paper, we apply different supervised learning techniques to build query-focused multi-document summarization systems, where the task is to produce automatic summaries in response to a given query or specific information request stated by the user. A huge amount of labeled data is a prerequisite for supervised training. It is expensive and time-consuming when humans perform the labeling task manually. Automatic labeling can be a good remedy to this problem. We employ five different automatic annotation techniques to build extracts from human abstracts using ROUGE, Basic Element overlap, a syntactic similarity measure, a semantic similarity measure, and the Extended String Subsequence Kernel. The supervised methods we use are Support Vector Machines, Conditional Random Fields, Hidden Markov Models, Maximum Entropy, and two ensemble-based approaches. During different experiments, we analyze the impact of the automatic labeling methods on the performance of the applied supervised methods. To our knowledge, no other study has deeply investigated and compared the effects of using different automatic annotation techniques on different supervised learning approaches in the domain of query-focused multi-document summarization.
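The automatic annotation idea in the abstract, building extract labels from human abstracts, can be illustrated with a minimal sketch. It assumes a simplified unigram-overlap score in the spirit of ROUGE-1 recall, not the official ROUGE toolkit and not necessarily the authors' exact procedure. The function names (`tokenize`, `unigram_recall`, `label_sentences`) and the `top_k` cutoff are hypothetical choices made for illustration only.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def unigram_recall(sentence, abstract):
    """Fraction of abstract unigrams (counted with multiplicity) found in the sentence.

    This is a simplified stand-in for ROUGE-1 recall (assumption).
    """
    ref = Counter(tokenize(abstract))
    cand = Counter(tokenize(sentence))
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[word]) for word, count in ref.items())
    return overlap / total


def label_sentences(sentences, abstracts, top_k=2):
    """Label the top_k document sentences closest to the human abstracts +1, the rest -1."""
    if not abstracts:
        raise ValueError("at least one human abstract is required")
    # Average the overlap score of each sentence over all human abstracts.
    scores = [
        sum(unigram_recall(s, a) for a in abstracts) / len(abstracts)
        for s in sentences
    ]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    positive = set(ranked[:top_k])
    return [1 if i in positive else -1 for i in range(len(sentences))]
```

Labels produced this way could then serve as training targets for a sentence-level classifier such as a Support Vector Machine. The overlap score could likewise be swapped for any of the other similarity measures the abstract lists.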
