Concept extraction and association from cancer literature

Authors: Yueyu Fu, Travis Bauer, Javed Mostafa, Mathew Palakal, Snehasis Mukhopadhyay

DOI: 10.1145/584931.584953

Abstract: There is a large and growing body of web-accessible biomedical literature. As this electronic literature grows, so does the possibility that document analysis techniques can be used to automatically extract useful information from it, particularly in discovering key concepts dealing with genes, proteins, drugs, and diseases, and associations among these concepts. VCGS (Vocabulary Cluster Generating System) was designed to determine associations among tokens in a subset of this literature, namely cancer. Such a system has notable potential to automate database construction in biomedicine, instead of relying on experts' analysis. This paper reports on mechanisms for generating clusters of tokens. A formal evaluation of the system, based on 5338 PubMed titles and abstracts, has been conducted against Swiss-Prot, in which associations are entered by experts by hand.
