Authors: Nikos Mamoulis, Yudian Zheng, Reynold Cheng, Guoliang Li, Zhipeng Huang
DOI:
Keywords: Computer science, Task (project management), Inference, Sentiment analysis, Information retrieval, Set (abstract data type), Order (business), Categorical variable, Data mining, Star (graph theory), Crowdsourcing
Abstract: Crowdsourcing employs human workers to solve computer-hard problems, such as data cleaning, entity resolution, and sentiment analysis. When crowdsourcing tabular data, e.g., the attribute values of an entity set, a worker's answers on different attributes (e.g., the nationality and age of a celebrity star) are often treated independently. This assumption is not always true and can lead to suboptimal crowdsourcing performance. In this paper, we present the T-Crowd system, which takes into consideration the intricate relationships among tasks, in order to converge faster to their true values. Particularly, T-Crowd integrates each worker's answers on different attributes to effectively learn his/her trustworthiness and the true data values. The attribute relationship information is also used to guide task allocation to workers. Finally, T-Crowd seamlessly supports categorical and continuous attributes, which are the two main datatypes found in typical databases. Our extensive experiments on real and synthetic datasets show that T-Crowd outperforms state-of-the-art methods in terms of truth inference and reducing the cost of crowdsourcing.
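
The idea of learning one trustworthiness score per worker from answers on both categorical and continuous attributes can be illustrated with a small sketch. This is not the T-Crowd model itself, which the abstract does not specify; it is a simplified iterative weighting scheme chosen for illustration. The function name `infer_truth`, the error normalization, the smoothing constant `eps`, and the toy data are all assumptions.

```python
from collections import defaultdict


def infer_truth(answers, attr_types, n_iter=10, eps=0.1):
    """Toy truth inference with one shared weight per worker.

    answers:    list of (worker, row, attribute, value) tuples
    attr_types: attribute -> "categorical" or "continuous"
    Returns (truth, weight): estimated cell values and worker weights.
    """
    workers = {w for w, _, _, _ in answers}
    weight = {w: 1.0 for w in workers}

    # Variance of each continuous attribute, used to put squared errors
    # on roughly the same 0..1 scale as categorical 0/1 errors (assumption).
    values = defaultdict(list)
    for _, _, a, v in answers:
        if attr_types[a] == "continuous":
            values[a].append(v)
    scale = {}
    for a, vals in values.items():
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        scale[a] = var if var > 0 else 1.0

    by_cell = defaultdict(list)
    for w, r, a, v in answers:
        by_cell[(r, a)].append((w, v))

    truth = {}
    for _ in range(n_iter):
        # Step 1: estimate each cell from the current worker weights.
        for (r, a), votes in by_cell.items():
            if attr_types[a] == "categorical":
                score = defaultdict(float)
                for w, v in votes:
                    score[v] += weight[w]
                truth[(r, a)] = max(score, key=score.get)  # weighted vote
            else:
                total = sum(weight[w] for w, _ in votes)
                truth[(r, a)] = sum(weight[w] * v for w, v in votes) / total

        # Step 2: re-estimate each worker's weight from errors on ALL
        # attributes, so both datatypes inform the same score.
        err = defaultdict(float)
        cnt = defaultdict(int)
        for w, r, a, v in answers:
            t = truth[(r, a)]
            if attr_types[a] == "categorical":
                e = 0.0 if v == t else 1.0
            else:
                e = min(1.0, (v - t) ** 2 / scale[a])
            err[w] += e
            cnt[w] += 1
        weight = {w: 1.0 / (err[w] / cnt[w] + eps) for w in workers}

    return truth, weight


# Hypothetical toy table: two celebrities, two attributes, three workers.
attr_types = {"nationality": "categorical", "age": "continuous"}
answers = [
    ("w1", "r1", "nationality", "US"), ("w1", "r1", "age", 30),
    ("w1", "r2", "nationality", "UK"), ("w1", "r2", "age", 50),
    ("w2", "r1", "nationality", "US"), ("w2", "r1", "age", 32),
    ("w2", "r2", "nationality", "UK"), ("w2", "r2", "age", 48),
    ("w3", "r1", "nationality", "FR"), ("w3", "r1", "age", 60),
    ("w3", "r2", "nationality", "DE"), ("w3", "r2", "age", 20),
]
truth, weight = infer_truth(answers, attr_types)
```

In this toy run, worker `w3` disagrees with the others on both attribute types, so its shared weight drops and its outlying age answers pull the continuous estimates less than they would under a plain average.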