CrowdGame: A Game-Based Crowdsourcing System for Cost-Effective Data Labeling

Authors: Tongyu Liu, Jingru Yang, Ju Fan, Zhewei Wei, Guoliang Li

DOI: 10.1145/3299869.3320221

Keywords:

Abstract: Large-scale data labeling has become a major bottleneck for many applications, such as machine learning and data integration. This paper presents CrowdGame, a crowdsourcing system that harnesses the crowd to gather data labels in a cost-effective way. CrowdGame focuses on generating high-quality labeling rules to largely reduce the labeling cost while preserving quality. It first generates candidate rules, then devises a game-based crowdsourcing approach to select rules with high coverage and accuracy. CrowdGame then applies the generated rules for effective data labeling. We have implemented CrowdGame and provided a user-friendly interface for users to deploy their labeling applications. We will demonstrate CrowdGame in two representative data labeling scenarios, entity matching and relation extraction.

References (7)
Ju Fan, Guoliang Li, Beng Chin Ooi, Kian-lee Tan, Jianhua Feng. iCrowd: An Adaptive Crowdsourcing Framework. International Conference on Management of Data, pp. 1015-1030 (2015). DOI: 10.1145/2723372.2750550
Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, Li Fei-Fei. ImageNet: A large-scale hierarchical image database. Computer Vision and Pattern Recognition, pp. 248-255 (2009). DOI: 10.1109/CVPR.2009.5206848
Dawei Gao, Yongxin Tong, Jieying She, Tianshu Song, Lei Chen, Ke Xu. Top-k Team Recommendation and Its Variants in Spatial Crowdsourcing. Data Science and Engineering, vol. 2, pp. 136-150 (2017). DOI: 10.1007/S41019-017-0037-1
Lei Chen, Zimu Zhou, H. V. Jagadish, Lidan Shou, Weifeng Lv, Yongxin Tong. SLADE: A Smart Large-Scale Task Decomposer in Crowdsourcing. IEEE Transactions on Knowledge and Data Engineering, vol. 30, pp. 1588-1601 (2018). DOI: 10.1109/TKDE.2018.2797962
Jingru Yang, Ju Fan, Zhewei Wei, Guoliang Li, Tongyu Liu, Xiaoyong Du. Cost-effective data annotation using game-based crowdsourcing. Proceedings of the VLDB Endowment, vol. 12, pp. 57-70 (2018). DOI: 10.14778/3275536.3275541
Paroma Varma, Christopher Ré. Snuba. Proceedings of the VLDB Endowment, vol. 12, pp. 223-236 (2018). DOI: 10.14778/3291264.3291268
Alexander Ratner, Stephen H. Bach, Henry Ehrenberg, Jason Fries, Sen Wu, Christopher Ré. Snorkel: rapid training data creation with weak supervision. Very Large Data Bases, vol. 11, pp. 269-282 (2017). DOI: 10.14778/3157794.3157797