Crowdsourcing for Query Processing on Web Data: A Case Study on the Skyline Operator

Authors: Kinda El Maarry, Christoph Lofi, Wolf-Tilo Balke

DOI: 10.2498/CIT.1002509

Keywords: Information processing, Skyline operator, Data mining, Human intelligence, Computer science, Variety (cybernetics), Information retrieval, Crowdsourcing, Heuristics

Abstract: In recent years, crowdsourcing has become a powerful tool to bring human intelligence into information processing. This is especially important for Web data, which, in contrast to well-maintained databases, is almost always incomplete and may be distributed over a variety of sources. Crowdsourcing allows us to tackle many problems that are not yet attainable using machine-based algorithms alone: in particular, it allows us to perform database operators on such data, as workers can be used to provide values during runtime. As this can become costly quickly, elaborate optimization is required. In this paper, we showcase how such optimizations can be performed for the popular skyline operator for preference queries. We present some heuristics-based approaches and compare them with sophisticated crowdsourcing-based techniques, focusing on result correctness.
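For readers unfamiliar with the skyline operator named in the abstract, the following is a minimal sketch of its standard definition: a tuple belongs to the skyline if no other tuple is at least as good in every attribute and strictly better in at least one. It is not the paper's method. The function names, the "larger is better" convention, and the sample data are illustrative assumptions, and the sketch assumes complete tuples (missing values, which the paper proposes to obtain from crowd workers, would have to be filled in first).

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is at least as good as b in every attribute
    and strictly better in at least one (assumption: larger is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Naive O(n^2) skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical data: (attribute 1 score, attribute 2 score) per item.
items = [(3, 9), (5, 5), (4, 4), (9, 1), (2, 2)]
print(skyline(items))  # (4, 4) and (2, 2) are dominated by (5, 5)
```

The quadratic scan is chosen for readability only; it makes the dominance test explicit, which is the comparison that becomes undefined when attribute values are missing.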

References (21)
David Martin Ward Powers. Evaluation: from Precision, Recall and F-measure to ROC, Informedness, Markedness and Correlation. arXiv: Learning, vol. 2, pp. 37-63 (2011).
Christoph Lofi, Kinda El Maarry, Wolf-Tilo Balke. Skyline Queries over Incomplete Data - Error Models for Focused Crowd-Sourcing. International Conference on Conceptual Modeling, pp. 298-312 (2013). DOI: 10.1007/978-3-642-41924-9_25
Edgar Acuña, Caroline Rodriguez. The Treatment of Missing Values and its Effect on Classifier Accuracy. Springer, Berlin, Heidelberg, pp. 639-647 (2004). DOI: 10.1007/978-3-642-17103-1_60
Parke Godfrey. Skyline Cardinality for Relational Processing. Foundations of Information and Knowledge Systems, pp. 78-97 (2004). DOI: 10.1007/978-3-540-24627-5_7
Michael F. Goodchild, J. Alan Glennon. Crowdsourcing geographic information for disaster response: a research frontier. International Journal of Digital Earth, vol. 3, pp. 231-241 (2010). DOI: 10.1080/17538941003759255
M. Six Silberman, Lilly Irani, Joel Ross. Ethics and tactics of professional crowdwork. ACM Crossroads Student Magazine, vol. 17, pp. 39-43 (2010). DOI: 10.1145/1869086.1869100
Akrivi Vlachou, Michalis Vazirgiannis. Ranking the sky: Discovering the importance of skyline points through subspace dominance relationships. Data and Knowledge Engineering, vol. 69, pp. 943-964 (2010). DOI: 10.1016/J.DATAK.2010.03.008
Petros Venetis, Hector Garcia-Molina. Quality control for comparison microtasks. Proceedings of the First International Workshop on Crowdsourcing and Data Mining - CrowdKDD '12, pp. 15-21 (2012). DOI: 10.1145/2442657.2442660
Eric Horvitz, Ece Kamar. Incentives for truthful reporting in crowdsourcing. Adaptive Agents and Multi-Agents Systems, pp. 1329-1330 (2012). DOI: 10.5555/2343896.2343988
Mohamed E. Khalefa, Mohamed F. Mokbel, Justin J. Levandoski. Skyline Query Processing for Incomplete Data. 2008 IEEE 24th International Conference on Data Engineering, pp. 556-565 (2008). DOI: 10.1109/ICDE.2008.4497464