Making Biographical Data in Wikipedia Readable: A Pattern-based Multilingual Approach

Authors: Itziar Gonzalez-Dios, María Jesús Aranzabe, Arantza Díaz de Ilarraza

DOI: 10.3115/V1/W14-5602

Keywords: Question generation, Natural language processing, Simple (philosophy), Paragraph, Artificial intelligence, Computer science

Abstract: In this paper we present Biografix, a pattern-based tool that simplifies parenthetical structures with biographical information, whose aim is to create simple, readable and accessible sentences. To that end, we analysed the parenthetical structures that appear in the first paragraph of the Basque Wikipedia, and concentrated on biographies. Although it has been designed and developed for Basque, we adapted it and evaluated it with five other languages. We also perform an extrinsic evaluation with a question generation system to see if Biografix improves its results.
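As a rough illustration of what pattern-based simplification of a biographical parenthetical might look like, here is a minimal Python sketch. It does not reproduce Biografix's actual patterns or output format: the regular expression, the `simplify` function, the English sentence templates and the example sentence are all assumptions made for this illustration, and the pattern handles only one layout, "Name (birthplace, birth date – death place, death date) rest of sentence".

```python
import re

# Hypothetical pattern (not taken from Biografix): matches
# "Name (birthplace, birth date – death place, death date) rest".
BIO_PATTERN = re.compile(
    r"^(?P<name>[^()]+?)\s*"
    r"\((?P<bplace>[^,()]+),\s*(?P<bdate>[^()]+?)\s*[–-]\s*"
    r"(?P<dplace>[^,()]+),\s*(?P<ddate>[^()]+?)\)\s*"
    r"(?P<rest>.+)$"
)


def simplify(sentence):
    """Split one biographical sentence into simpler sentences.

    Returns the input unchanged (as a one-element list) when the
    assumed pattern does not match.
    """
    m = BIO_PATTERN.match(sentence.strip())
    if m is None:
        return [sentence]
    name = m.group("name")
    return [
        # Main clause with the parenthetical removed.
        f"{name} {m.group('rest')}",
        # One explicit sentence per fact that was inside the parentheses.
        f"{name} was born on {m.group('bdate')} in {m.group('bplace')}.",
        f"{name} died on {m.group('ddate')} in {m.group('dplace')}.",
    ]


if __name__ == "__main__":
    example = (
        "Wolfgang Amadeus Mozart (Salzburg, 27 January 1756 – "
        "Vienna, 5 December 1791) was a composer."
    )
    for line in simplify(example):
        print(line)
```

A real multilingual tool would need one set of patterns and sentence templates per language, plus handling for living persons and missing places or dates, none of which this sketch attempts.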

References (14)
David Hardcastle, Catalina Hallett. Automatic Rewriting of Patient Record Narratives. Language Resources and Evaluation (2008).
Lucia Specia, Sandra Maria Aluísio, Ann Copestake, Arnaldo Candido Jr. Towards an on-demand Simple Portuguese Wikipedia. Proceedings of the Second Workshop on Speech and Language Processing for Assistive Technologies, pp. 137-147 (2011).
Violeta Seretan. Acquisition of Syntactic Simplification Rules for French. Language Resources and Evaluation, pp. 4019-4026 (2012).
Nicole Dehé, Yordanka Kavalova. Parentheticals: An introduction. pp. 1-24 (2007).
Biljana Drndarević, Sanja Štajner, Stefan Bott, Susana Bautista, Horacio Saggion. Automatic text simplification in Spanish: a comparative evaluation of complementing modules. International Conference on Computational Linguistics, pp. 488-500 (2013). DOI: 10.1007/978-3-642-37256-8_40
Agurtzane Azpeitia Eizagirre. Enuntziatu parentetikoak: Koldo Mitxelenaren intentzio ironikoaren ispilu. Gogoa: Euskal Herriko Unibertsitateko hizkuntza, ezagutza, komunikazio eta ekintzari buruzko aldizkaria, vol. 10, pp. 21-54 (2011).
Itziar Gonzalez-Dios, Itziar Aldabe, Iñigo Lopez-Gazpio, Montse Maritxalar, Ion Madrazo. Two approaches to generate questions in Basque. Procesamiento del Lenguaje Natural, vol. 51, pp. 101-108 (2013).
Pablo A. Duboue, Kathleen R. McKeown, Vasileios Hatzivassiloglou. PROGENIE: biographical descriptions for intelligence analysis. Intelligence and Security Informatics, pp. 343-345 (2003). DOI: 10.1007/3-540-44853-5_26
Diane Blakemore. Divisions of labour: The analysis of parentheticals. Lingua, vol. 116, pp. 1670-1687 (2006). DOI: 10.1016/J.LINGUA.2005.04.007
Gondy Leroy, David Kauchak, Obay Mouradi. A user-study measuring the effects of lexical simplification and coherence enhancement on perceived and actual text difficulty. International Journal of Medical Informatics, vol. 82, pp. 717-730 (2013). DOI: 10.1016/J.IJMEDINF.2013.03.001