A probabilistic ranking framework for web-based relational data imputation

Authors: Zhaoqiang Chen, Qun Chen, Jiajun Li, Zhanhuai Li, Lei Chen

DOI: 10.1016/J.INS.2016.03.036

Abstract: Due to the richness of information on the web, there is an increasing interest in searching the web for missing attribute values in relational data. Web-based imputation has to first extract multiple candidate values from the web and then rank them by their matching probabilities. However, effective ranking remains challenging because web documents are unstructured and popular search engines can only provide relevant, but not necessarily semantically matching, information. In this paper, we propose a novel probabilistic approach for ranking the web-retrieved candidate values. It integrates various influence factors, e.g. snippet rank order, occurrence frequency, occurrence pattern, and keyword proximity, in a single framework by semantic reasoning. The proposed framework consists of a snippet influence model and a semantic matching model. The former measures the influence of a snippet, and the latter measures the semantic similarity between a candidate value in a snippet and a missing value in a tuple. We also present probabilistic estimation solutions for both models. Finally, we empirically evaluate the performance of the proposed framework on real datasets. Our extensive experiments demonstrate that it outperforms state-of-the-art techniques by considerable margins in imputation accuracy.
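
The abstract gives no formulas, so the following is only a toy sketch of the general idea of ranking web-retrieved candidates by combining several factors (snippet order, occurrence frequency, keyword proximity) into normalized scores. It is not the paper's model: the function name `rank_candidates`, the logarithmic rank discount, and the `1 / (1 + distance)` proximity weight are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def rank_candidates(occurrences):
    """Rank candidate values from (candidate, snippet_rank, keyword_distance) triples.

    All weighting choices below are illustrative assumptions, not the
    estimators proposed in the paper.
    """
    scores = defaultdict(float)
    for candidate, snippet_rank, keyword_distance in occurrences:
        # Assumed snippet weight: higher-ranked snippets count more (rank 1 -> 1.0).
        influence = 1.0 / math.log2(snippet_rank + 1)
        # Assumed proximity weight: candidates closer to the query keywords count more.
        match = 1.0 / (1.0 + keyword_distance)
        # Summing over occurrences rewards candidates that appear frequently.
        scores[candidate] += influence * match

    if not scores:
        return []

    total = sum(scores.values())
    # Normalize so the scores sum to 1 and can be read as relative probabilities.
    return sorted(
        ((cand, score / total) for cand, score in scores.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )


# Hypothetical occurrences: (candidate value, snippet rank, token distance to keywords).
occurrences = [("Seattle", 1, 2), ("Seattle", 3, 5), ("Redmond", 2, 1)]
ranked = rank_candidates(occurrences)
for candidate, probability in ranked:
    print(f"{candidate}: {probability:.3f}")
```

In this toy input, "Seattle" ranks first because it occurs in two snippets, one of them top-ranked, even though "Redmond" sits closer to the keywords in its single snippet.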
