Information extraction for enhanced access to disease outbreak reports

Authors: Ralph Grishman, Silja Huttunen, Roman Yangarber

Keywords: Web crawler, Information retrieval, Outbreak, Information extraction, World Wide Web, Computer science

Abstract: Document search is generally based on individual terms in the document. However, for collections within limited domains it is possible to provide more powerful access tools. This paper describes a system designed for collections of reports of infectious disease outbreaks. The system, Proteus-BIO, automatically creates a table of outbreaks, with each table entry linked to the document describing that outbreak; this makes it possible to use database operations such as selection and sorting to find relevant documents. Proteus-BIO consists of a Web crawler which gathers relevant documents; an information extraction engine which converts the individual outbreak events to a tabular database; and a database browser which provides access to the events and, through them, to the documents. The information extraction engine uses sets of patterns and word classes to extract the information about each event. Preparing these patterns and word classes has been a time-consuming manual operation in the past, but automated discovery tools now make this task significantly easier. A small study comparing the effectiveness of the tabular index with conventional Web search tools demonstrated that users can find substantially more documents in a given time period with Proteus-BIO.

