An Ontology-based Summarization System for Arabic Documents (OSSAD)

Authors: Ibrahim Imam, Nihal Nounou, Alaa Hamouda, Hebat Allah Abdul Khalek

DOI: 10.5120/12980-0237

Keywords: WordNet, Artificial intelligence, Domain (software engineering), Web resource, Natural language processing, Information retrieval, Computer science, Automatic summarization, Domain knowledge, Knowledge base, Ontology (information science), Set (abstract data type), Decision tree learning, Arabic

Abstract: With the growth of web resources and the huge amount of information available, the need for automatic summarization systems has emerged. Since summarization is needed most in the process of searching for information on the web, where the user aims at a certain domain of interest according to his query, domain-based summaries would serve best. Despite the existence of plenty of research work in English, there is a lack of it in Arabic due to the shortage of existing knowledge bases. In this paper an Ontology-based Summarization System for Arabic Documents, OSSAD, is introduced. Domain knowledge is extracted from a corpus and represented by topic-related concepts/keywords and the lexical relations among them. The user's query is first expanded using WordNet and then by adding the domain-specific knowledge base to the expansion. For summarization, a decision tree algorithm (C4.5) is used, which was trained on a set of features extracted from the original documents. For the testing dataset, the Essex Arabic Summaries Corpus (EASC) was used. Recall-Oriented Understudy for Gisting Evaluation (ROUGE) was used to compare OSSAD summaries with human summaries along with other systems, showing that the proposed approach demonstrated promising results.
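
The abstract reports a ROUGE-based comparison against human summaries but does not say which ROUGE variant or tokenization was used. As an illustration only, the following is a minimal sketch of ROUGE-N recall with clipped n-gram counts. The function names `ngrams` and `rouge_n_recall` are hypothetical, and plain whitespace tokenization is an assumption; Arabic text would normally need proper tokenization and normalization first. This is not the evaluation code used by the authors.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall: clipped n-gram overlap divided by reference n-gram count.

    Assumption: both summaries are tokenized by splitting on whitespace.
    """
    cand_counts = Counter(ngrams(candidate.split(), n))
    ref_counts = Counter(ngrams(reference.split(), n))
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    # Each reference n-gram is matched at most as often as it occurs in the candidate.
    overlap = sum(min(count, cand_counts[gram]) for gram, count in ref_counts.items())
    return overlap / total


if __name__ == "__main__":
    system_summary = "the cat sat on the mat"
    human_summary = "the cat lay on the mat"
    print(rouge_n_recall(system_summary, human_summary, n=1))
    print(rouge_n_recall(system_summary, human_summary, n=2))
```

Recall is measured against the human summary, so a system summary that covers more of the reference n-grams scores higher regardless of how much extra text it contains.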
