CrowdScreen

Authors: Aditya G. Parameswaran, Hector Garcia-Molina, Hyunjung Park, Neoklis Polyzotis, Aditya Ramesh

Keywords: Set (abstract data type), Computer science, Artificial intelligence, Heuristics, Probabilistic analysis of algorithms, Large set (Ramsey theory), State (computer science), Machine learning, Algorithm, Variety (cybernetics), Theoretical computer science, Crowdsourcing

Abstract: Given a large set of data items, we consider the problem of filtering them based on properties that can be verified by humans. This problem is commonplace in crowdsourcing applications, and yet, to our knowledge, no one has considered the formal optimization of this problem. (Typical solutions use heuristics to solve the problem.) We formally state a few different variants of this problem. We develop deterministic and probabilistic algorithms to optimize the expected cost (i.e., number of questions) and expected error. We experimentally show that our algorithms provide definite gains with respect to other strategies. Our algorithms can be applied in a variety of crowdsourcing scenarios and can form an integral part of any query processor that uses human computation.
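To make the two quantities in the abstract concrete, the following is a minimal sketch of how expected cost (number of questions) and expected error could be computed for one simple baseline strategy: majority vote over at most `m` answers, stopping as soon as the outcome is decided. This is an illustration only, not the paper's algorithm. The function name `evaluate_majority` and the parameters `s` (prior probability that an item truly passes the filter), `fp` (probability a worker says "yes" for a failing item), and `fn` (probability a worker says "no" for a passing item) are assumptions introduced here.

```python
def evaluate_majority(m, s, fp, fn):
    """Expected questions per item and expected error for majority-of-m
    voting with early stopping (hypothetical baseline, not from the paper).

    m  : odd maximum number of questions per item
    s  : assumed prior probability that an item truly passes
    fp : assumed P(worker answers yes | item fails)
    fn : assumed P(worker answers no  | item passes)
    """
    need = m // 2 + 1  # votes required to settle the decision
    cost = 0.0
    error = 0.0
    # Condition on the true label, then weight by its prior.
    for truth, prior, p_yes in ((True, s, 1.0 - fn), (False, 1.0 - s, fp)):
        reach = {(0, 0): 1.0}  # (yes_count, no_count) -> probability
        while reach:
            nxt = {}
            for (y, n), p in reach.items():
                if y == need or n == need:
                    # Decision reached: charge the questions asked so far.
                    cost += prior * p * (y + n)
                    if (y == need) != truth:
                        error += prior * p
                    continue
                nxt[(y + 1, n)] = nxt.get((y + 1, n), 0.0) + p * p_yes
                nxt[(y, n + 1)] = nxt.get((y, n + 1), 0.0) + p * (1.0 - p_yes)
            reach = nxt
    return cost, error


if __name__ == "__main__":
    for m in (1, 3, 5):
        c, e = evaluate_majority(m, s=0.5, fp=0.1, fn=0.1)
        print(f"m={m}: expected questions={c:.3f}, expected error={e:.4f}")
```

Raising `m` lowers the expected error at the price of more questions per item. That trade-off between cost and error is the one the abstract says the paper optimizes formally rather than by heuristics.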

