Authors: Ginger Tsueng, Steven M. Nanis, Jennifer Fouquier, Benjamin M. Good, Andrew I. Su
DOI: 10.1101/038083
Keywords:
Abstract: Biomedical literature represents one of the largest and fastest growing collections of unstructured biomedical knowledge. Finding critical information buried in this literature can be challenging. In order to extract information from free-flowing text, researchers need to: 1. identify the entities in the text (named entity recognition), 2. apply a standardized vocabulary to these entities (normalization), and 3. determine how the entities are related to one another (relationship extraction). Researchers have primarily approached these extraction tasks through manual expert curation and computational methods. We previously demonstrated that named entity recognition (NER) could be crowdsourced to a group of nonexperts via the paid microtask platform Amazon Mechanical Turk (AMT), and that doing so can dramatically reduce the cost and increase the throughput of biocuration efforts. However, given the size of the biomedical literature, even this approach on paid platforms is not scalable. With our web-based application Mark2Cure (http://mark2cure.org), we demonstrate that NER can also be performed by volunteer citizen scientists with high accuracy. The metrics from the Zooniverse Matrices of Citizen Science Success that we provide here serve as a basis of comparison for other citizen science projects. Further, we discuss design considerations, engagement issues, and analytics for successfully moving a crowdsourcing workflow from a paid microtask platform to a volunteer citizen science platform. To our knowledge, this study is the first application of citizen science to a natural language processing task.