Authors: Xin Luna Dong, Laure Berti-Equille, Divesh Srivastava
DOI: 10.1007/978-3-642-38562-9_7
Keywords: Data quality, Data management, Set (abstract data type), Data science, Information retrieval, Copy detection, Sensor fusion, Computer science, Scalability, Enterprise data management, Task (project management)
Abstract: Many data management applications, such as setting up Web portals, managing enterprise data, managing community data, and sharing scientific data, require integrating data from multiple sources. Each of these sources provides a set of values, and different sources can often provide conflicting values. To present quality data to users, it is critical to resolve conflicts and discover values that reflect the real world; this task is called data fusion. This paper describes a novel approach that finds true values from conflicting information when there are a large number of sources, among which some may copy from others. We present a case study on real-world data showing that the described algorithm can significantly improve the accuracy of truth discovery and is scalable.