Data fusion: resolving conflicts from multiple sources

Authors: Xin Luna Dong, Laure Berti-Equille, Divesh Srivastava

DOI: 10.1007/978-3-642-38562-9_7

Keywords: Data quality, Data management, Set (abstract data type), Data science, Information retrieval, Copy detection, Sensor fusion, Computer science, Scalability, Enterprise data management, Task (project management)

Abstract: Many data management applications, such as setting up Web portals, managing enterprise data, managing community data, and sharing scientific data, require integrating data from multiple sources. Each of these sources provides a set of values, and different sources can often provide conflicting values. To present quality data to users, it is critical to resolve conflicts and discover values that reflect the real world; this task is called data fusion. This paper describes a novel approach that finds true values from conflicting information when there are a large number of sources, among which some may copy from others. We present a case study on real-world data showing that the described algorithm can significantly improve the accuracy of truth discovery and is scalable.
