Authors: Lynette Hirschman, Alexander A. Morgan, Alexander S. Yeh
DOI: 10.1016/S1532-0464(03)00014-5
Keywords:
Abstract: As the pace of biological research accelerates, biologists are becoming increasingly reliant on computers to manage the information explosion. Biologists communicate their research findings by relying on precise biological terms; these terms then provide indices into the literature and across the growing number of biological databases. This article examines emerging techniques to access biological resources through extraction of entity names and relations among them. Information extraction has been an active area of research in natural language processing, and there are promising results for information extraction applied to news stories, e.g., balanced precision and recall in the 93-95% range for identifying person, organization and location names. But these results do not seem to transfer directly to biological names, where results remain in the 75-80% range. Multiple factors may be involved, including the absence of shared training and test sets for rigorous measures of progress, the lack of annotated training data specific to biological tasks, the pervasive ambiguity of terms, the frequent introduction of new terms, and a mismatch between evaluation tasks as defined and the tasks required to solve real problems. We present evidence from a simple lexical matching exercise that illustrates some specific problems encountered when identifying biological names. We conclude by outlining a research agenda to raise the performance of named entity tagging to a level where it can be used to perform tasks of biological importance.