Authors: Sarath Kumar Kondreddi, Peter Triantafillou, Gerhard Weikum
DOI: 10.1109/ICDE.2014.6816717
Keywords:
Abstract: Automatic information extraction (IE) enables the construction of very large knowledge bases (KBs), with relational facts on millions of entities from text corpora and Web sources. However, such KBs contain errors and they are far from being complete. This motivates the need for exploiting human intelligence using crowd-based human computing (HC) for assessing the validity of facts and for gathering additional knowledge. This paper presents a novel system architecture, called Higgins, which shows how to effectively integrate an IE engine and an HC engine. Higgins generates game questions where players choose or fill in missing relations for subject-relation-object triples. For generating multiple-choice answer candidates, we have constructed a dictionary of entity names and phrases, and have developed specifically designed statistical language models for phrase relatedness. To this end, we combine semantic resources like WordNet, ConceptNet, and others with statistics derived from a large Web corpus. We demonstrate the effectiveness of Higgins for knowledge acquisition by crowdsourcing relationships between characters in narrative descriptions of movies and books.