Cost Reduction for Web-Based Data Imputation

Authors: Zhixu Li, Shuo Shang, Qing Xie, Xiangliang Zhang

DOI: 10.1007/978-3-319-05813-9_29

Keywords:

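Abstract: Web-based Data Imputation enables the completion of incomplete data sets by retrieving absent field values from the Web. In particular, complete fields can be used as keywords in imputation queries for absent fields. However, due to the ambiguity of these keywords and the complexity of data on the Web, different queries may retrieve different answers for the same absent field value. To decide the most probable right answer for each absent field value, the existing method issues quite a few of the available imputation queries for each absent value and then votes to decide the most probable right answer. As a result, a large number of imputation queries have to be issued to fill all absent values in an incomplete data set, which brings a large overhead. In this paper, we work on reducing the cost of Web-based Data Imputation in two aspects. First, we propose a query execution scheme that can secure the most probable right answer to an absent field value by issuing as few imputation queries as possible. Second, we recognize and prune, a priori, queries that will probably fail to return any answers. Our extensive experimental evaluation shows that the proposed techniques substantially reduce the cost of Web-based Data Imputation without hurting its high accuracy.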
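The two cost-saving ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' scheme: it assumes plain unweighted majority voting, and the names `vote_with_early_stop`, `run_query`, and `should_skip` are hypothetical. Voting stops as soon as the leading answer can no longer be overtaken by the queries still pending, and queries flagged as likely to fail are dropped before any are issued.

```python
from collections import Counter
from typing import Callable, Hashable, Iterable, Optional, Tuple


def vote_with_early_stop(
    queries: Iterable[str],
    run_query: Callable[[str], Optional[Hashable]],
    should_skip: Optional[Callable[[str], bool]] = None,
) -> Tuple[Optional[Hashable], int]:
    """Return (winning answer, number of queries actually issued).

    Assumption: every query casts one equal vote. `run_query` returns None
    when a query retrieves no answer. `should_skip` is a hypothetical
    predicate that marks queries expected to fail, so they are pruned.
    """
    pending = [q for q in queries if should_skip is None or not should_skip(q)]
    votes: Counter = Counter()
    issued = 0

    for query in pending:
        answer = run_query(query)
        issued += 1
        if answer is not None:
            votes[answer] += 1

        remaining = len(pending) - issued
        ranked = votes.most_common(2)
        if ranked:
            lead = ranked[0][1]
            runner_up = ranked[1][1] if len(ranked) > 1 else 0
            # Even if every remaining query voted for the runner-up
            # (or for a brand-new answer), the leader would still win.
            if lead - runner_up > remaining:
                return ranked[0][0], issued

    best = votes.most_common(1)
    return (best[0][0] if best else None), issued


if __name__ == "__main__":
    # Hypothetical canned answers standing in for real Web retrieval.
    canned = {"q1": "Paris", "q2": "Paris", "q3": "Paris", "q4": "Lyon", "q5": None}
    answer, cost = vote_with_early_stop(canned.keys(), canned.get)
    print(answer, cost)
```

In this toy run the third query already makes the leading answer unbeatable, so the last two queries are never issued. A scheme that weights queries by confidence, rather than counting each vote equally, would need a correspondingly different stopping condition.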

References (30)
Steven L. Salzberg, Alberto Segre. Programs for Machine Learning. (1994)
Zhixu Li, Mohamed A. Sharaf, Laurianne Sitbon, Shazia Sadiq, Marta Indulska, Xiaofang Zhou. WebPut: Efficient Web-Based Data Imputation. Web Information Systems Engineering, vol. 7651, pp. 243-256 (2012). DOI: 10.1007/978-3-642-35063-4_18
Luis Gravano, Eugene Agichtein. Extracting Relations from Large Plain-Text Collections. Department of Computer Science, Columbia University (1999). DOI: 10.7916/D8NG4ZHK
Chih-Hung Wu, Chian-Huei Wun, Hung-Ju Chou. Using Association Rules for Completing Missing Data. International Conference on Hybrid Intelligent Systems, pp. 236-241 (2004). DOI: 10.1109/ICHIS.2004.91
Shichao Zhang. Shell-Neighbor Method and Its Application in Missing Data Imputation. Applied Intelligence, vol. 35, pp. 123-133 (2011). DOI: 10.1007/S10489-009-0207-6
Shichao Zhang. Nearest Neighbor Selection for Iteratively kNN Imputation. Journal of Systems and Software, vol. 85, pp. 2541-2552 (2012). DOI: 10.1016/J.JSS.2012.05.073
Gustavo E. A. P. A. Batista, Maria Carolina Monard. An Analysis of Four Missing Data Treatment Methods for Supervised Learning. Applied Artificial Intelligence, vol. 17, pp. 519-533 (2003). DOI: 10.1080/713827181
Xiaofeng Zhu, Shichao Zhang, Zhi Jin, Zili Zhang, Zhuoming Xu. Missing Value Estimation for Mixed-Attribute Data Sets. IEEE Transactions on Knowledge and Data Engineering, vol. 23, pp. 110-121 (2011). DOI: 10.1109/TKDE.2010.99
Surajit Chaudhuri. What next? Proceedings of the 31st Symposium on Principles of Database Systems (PODS '12), pp. 1-4 (2012). DOI: 10.1145/2213556.2213558