RankPref: Ranking Sentences Describing Relations between Biomedical Entities with an Application

Authors: Catalina Oana Tudor, K Vijay-Shanker

Abstract: This paper presents a machine learning approach that selects and, more generally, ranks sentences containing clear relations between genes and terms that are related to them. This is treated as a binary classification task, where preference judgments are used to learn how to choose a sentence from a pair of sentences. Features that capture how the relationship is described textually, as well as how central the relationship is in the sentence, are used in the learning process. Simplification of complex sentences into simple structures is also applied for the extraction of the features. We show that such simplification improves the results by up to 13%. We conducted three different evaluations, and we found that the system significantly outperforms the baselines.
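The reduction of ranking to binary classification over sentence pairs can be illustrated with a small sketch. The abstract does not name the classifier or list the exact features, so everything specific here is an assumption made for illustration: the perceptron learner, the three feature names, and the toy values are hypothetical and are not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]

# Hypothetical features per sentence (assumed, not from the paper):
#   [has_relation_verb, entities_in_main_clause, normalized_length]


def train_pairwise(
    prefs: List[Tuple[Vector, Vector]], epochs: int = 20, lr: float = 0.1
) -> List[float]:
    """Learn weights w such that w . (preferred - other) > 0 for each pair.

    Each preference judgment becomes one binary example built from the
    difference of the two feature vectors. A plain perceptron stands in
    for whatever classifier the authors actually used.
    """
    w = [0.0] * len(prefs[0][0])
    for _ in range(epochs):
        for preferred, other in prefs:
            diff = [p - o for p, o in zip(preferred, other)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            if margin <= 0:  # pair is ordered wrongly: perceptron update
                w = [wi + lr * di for wi, di in zip(w, diff)]
    return w


def score(w: Vector, x: Vector) -> float:
    """Linear score; a higher score means the sentence is preferred."""
    return sum(wi * xi for wi, xi in zip(w, x))


def rank(w: Vector, sentences: List[Tuple[str, Vector]]) -> List[Tuple[str, Vector]]:
    """Sort candidate sentences from most to least preferred."""
    return sorted(sentences, key=lambda s: score(w, s[1]), reverse=True)


# Toy preference judgments: (features of preferred, features of other).
prefs = [
    ([1.0, 1.0, 0.2], [0.0, 0.0, 0.9]),
    ([1.0, 0.0, 0.3], [0.0, 0.0, 0.4]),
    ([1.0, 1.0, 0.5], [1.0, 0.0, 0.5]),
]
weights = train_pairwise(prefs)

candidates = [
    ("s_vague", [0.0, 0.0, 0.5]),
    ("s_partial", [1.0, 0.0, 0.4]),
    ("s_clear", [1.0, 1.0, 0.3]),
]
ranked = rank(weights, candidates)
print([name for name, _ in ranked])
```

Training on feature differences means the learned weights double as a scoring function, so a full ranking of candidate sentences falls out of a model that was only ever asked to compare two sentences at a time.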
