Crowdsourced Data Management: Overview and Challenges

Authors: Guoliang Li, Yudian Zheng, Ju Fan, Jiannan Wang, Reynold Cheng

DOI: 10.1145/3035918.3054776

Abstract: Many important data management and analytics tasks cannot be completely addressed by automated processes. Crowdsourcing is an effective way to harness human cognitive abilities to process these computer-hard tasks, such as entity resolution, sentiment analysis, and image recognition. Crowdsourced data management has been extensively studied in research and industry recently. In this tutorial, we will survey and synthesize a wide spectrum of existing studies on crowdsourced data management. We first give an overview of crowdsourcing, and then summarize the fundamental techniques, including quality control, cost control, and latency control, which must be considered in crowdsourced data management. Next we review crowdsourced operators, including selection, collection, join, top-k, sort, categorize, aggregation, skyline, planning, schema matching, mining, and spatial crowdsourcing. We also discuss crowdsourcing optimization techniques and systems. Finally, we provide the emerging challenges.
