Author: Nona Naderi
DOI:
Keywords:
Abstract: Mutations, as sources of evolution, have long been the focus of attention in the biomedical literature. Access to mutational information and the impacts of mutations on protein properties facilitates research in various domains, such as enzymology and pharmacology. However, manually reading through the rich and fast-growing repository of biomedical literature is expensive and time-consuming. A number of curated databases, such as BRENDA (http://www.brenda-enzymes.org), try to index and provide this information; yet the provided data seems to be incomplete. Thus, there is a growing need for automated approaches to extract this information. In this work, we present a system to automatically extract and summarize impact information from protein mutations. Our extraction module is split into subtasks: organism analysis, mutation detection, protein property extraction, and impact analysis. Organisms, as the sources of proteins, need to be extracted to help disambiguate genes and proteins. Thus, our system extracts organisms and grounds them to NCBI. We detect mutation series in order to correctly ground the detected impacts. Our system also extracts the affected protein properties as well as the magnitude of the effects. The output is populated into an OWL-DL ontology, which can then be queried for structured information. The performance of the system is evaluated on both external and internal corpora and databases. The results show the reliability of the approaches. Organism extraction achieves a precision and recall of 95% and 94% and a grounding accuracy of 97.5% on the OT corpus. On the manually annotated corpus Linneaus-100, it achieves a precision and recall of 99% and 97% with a grounding accuracy of 97.4%. In the impact detection task, the system achieves a precision and recall of 70.4%-71.8% and 71.2%-71.3%, respectively, on manually annotated documents. It grounds the detected impacts with an accuracy of 70.1%-71.7% on manually annotated documents, and with a precision and recall of 57%-57.5% and 82.5%-84.2% against database data.