Maximal fusion of facts on the web with credibility guarantee

Authors: Thanh Tam Nguyen, Thanh Cong Phan, Quoc Viet Hung Nguyen, Karl Aberer, Bela Stantic

DOI: 10.1016/J.INFFUS.2018.07.009

Keywords: Openness to experience, Misinformation, Convergence (routing), Computer science, Maximal set, World Wide Web, Process (engineering), Knowledge extraction, Credibility, Statistical model

Abstract: The Web has become the central medium for valuable sources of information fusion applications. However, such user-generated resources are often plagued by inaccuracies and misinformation as a result of the inherent openness and uncertainty of the Web. While finding objective data is non-trivial, assessing their credibility with high confidence is even harder due to conflicts between sources. In this work, we consider the novel setting of fusing factual data from the Web with a credibility guarantee and maximal recall. The ultimate goal is that not only should the information be extracted as much as possible, but also its credibility must satisfy a threshold requirement. To this end, we formulate the problem of instantiating a maximal set of factual information such that its precision is larger than a pre-defined threshold. Our proposed approach is a learning process to optimize the parameters of a probabilistic model that captures the relationships between data sources, their contents, and the underlying factual information. The model automatically searches for the best parameters without pre-trained data. Upon convergence, the parameters are used to instantiate as much factual information as possible with a precision guarantee. Our evaluations on real-world datasets show that our approach outperforms the baselines by up to 6 times.
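The selection step named in the abstract (instantiating a maximal set of facts whose precision exceeds a pre-defined threshold) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes that some probabilistic model has already produced a credibility probability for each candidate fact, and it uses the mean of those probabilities as an estimate of the set's precision. The function name `select_maximal_credible_set` and the input format are hypothetical.

```python
from typing import Dict, List


def select_maximal_credible_set(fact_probs: Dict[str, float],
                                threshold: float) -> List[str]:
    """Return the largest set of facts whose expected precision >= threshold.

    Assumption (not from the paper): expected precision of a set is the
    mean of the per-fact credibility probabilities in that set.
    """
    # Taking facts in descending order of probability maximizes the mean
    # for every set size, so the best set of size k is always a prefix.
    ranked = sorted(fact_probs.items(), key=lambda kv: kv[1], reverse=True)

    selected: List[str] = []
    total = 0.0
    for fact, prob in ranked:
        # The prefix mean never increases, so once adding a fact would
        # drop it below the threshold, no larger prefix can qualify.
        if (total + prob) / (len(selected) + 1) < threshold:
            break
        selected.append(fact)
        total += prob
    return selected


if __name__ == "__main__":
    # Hypothetical credibility scores for four candidate facts.
    scores = {"fact_a": 0.95, "fact_b": 0.90, "fact_c": 0.60, "fact_d": 0.30}
    print(select_maximal_credible_set(scores, threshold=0.8))
```

Raising the threshold shrinks the returned set and lowering it grows the set, which is the trade-off between the credibility guarantee and recall that the abstract describes.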

References (69)
Burr Settles, Active Learning (2012)
Benjamin I. P. Rubinstein, Jim Gemmell, Jiawei Han, Bo Zhao, A Bayesian Approach to Discovering Truth from Conflicting Sources for Data Integration. arXiv: Databases (2012)
Alexandra Olteanu, Stanislav Peshterliev, Xin Liu, Karl Aberer, Web credibility: features exploration and credibility prediction. European Conference on Information Retrieval, pp. 557-568 (2013), 10.1007/978-3-642-36973-5_47
José Antonio Iglesias, Alexandra Tiemblo, Agapito Ledezma, Araceli Sanchis, Web news mining in an evolving framework. Information Fusion, vol. 28, pp. 90-98 (2016), 10.1016/J.INFFUS.2015.07.004
Andrew McCallum, Kedar Bellare, Fernando Pereira, A conditional random field for discriminatively-trained finite-state string edit distance. Uncertainty in Artificial Intelligence, pp. 388-395 (2005), 10.21236/ADA440386
Zhe Zhao, Paul Resnick, Qiaozhu Mei, Enquiring Minds: Early Detection of Rumors in Social Media from Enquiry Posts. The Web Conference, pp. 1395-1405 (2015), 10.1145/2736277.2741637
Jeff Pasternack, Dan Roth, Latent credibility analysis. Proceedings of the 22nd international conference on World Wide Web - WWW '13, pp. 1009-1020 (2013), 10.1145/2488388.2488476
Carmen De Maio, Giuseppe Fenza, Vincenzo Loia, Mimmo Parente, Time Aware Knowledge Extraction for microblog summarization on Twitter. Information Fusion, vol. 28, pp. 60-74 (2016), 10.1016/J.INFFUS.2015.06.004
Subhabrata Mukherjee, Gerhard Weikum, Cristian Danescu-Niculescu-Mizil, People on drugs: credibility of user statements in health communities. Knowledge Discovery and Data Mining, pp. 65-74 (2014), 10.1145/2623330.2623714
Bianca Zadrozny, Charles Elkan, Learning and making decisions when costs and probabilities are both unknown. Knowledge Discovery and Data Mining, pp. 204-213 (2001), 10.1145/502512.502540