Schema matching prediction with applications to data source discovery and dynamic ensembling

Authors: Tomer Sagi, Avigdor Gal

DOI: 10.1007/S00778-013-0325-Y

Abstract: Web-scale data integration involves fully automated efforts which lack knowledge of the exact match between data descriptions. In this paper, we introduce schema matching prediction, an assessment mechanism to support schema matchers in the absence of an exact match. Given attribute pair-wise similarity measures, a predictor predicts the success of a matcher in identifying correct correspondences. We present a comprehensive framework in which predictors can be defined, designed, and evaluated. We formally define schema matching evaluation and schema matching prediction using similarity spaces and discuss a set of four desirable properties of predictors, namely correlation, robustness, tunability, and generalization. We present a method for constructing predictors, supporting generalization, and introduce prediction models as a means of tuning prediction toward various quality measures. We define the empirical properties of correlation and robustness and provide concrete measures for their evaluation. We illustrate the usefulness of schema matching prediction by presenting three use cases. First, we propose a method for ranking the relevance of deep Web sources with respect to given user needs. Second, we show how predictors can assist in the design of schema matching systems. Finally, we show how prediction can support dynamic weight setting of matchers in an ensemble, thus improving upon current state-of-the-art weight setting methods. An extensive empirical evaluation shows the usefulness of predictors in these use cases and demonstrates the usefulness of prediction models in increasing the performance of schema matching.
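
To make the idea of a predictor more concrete, the following is a minimal Python sketch, not the authors' definitions. It assumes a matcher's output is a matrix of attribute pair-wise similarities in [0, 1], and it uses one illustrative predictor (the mean of each row's maximum similarity) chosen here only for simplicity. The names `row_max_predictor` and `weighted_ensemble` are hypothetical. The second function shows how predictor scores could, under these assumptions, set ensemble weights dynamically.

```python
from statistics import mean


def row_max_predictor(sim):
    """Illustrative predictor (an assumption, not taken from the paper):
    the mean of each row's maximum similarity. A high value means each
    source attribute has at least one strongly preferred candidate."""
    return mean(max(row) for row in sim)


def weighted_ensemble(matrices):
    """Combine similarity matrices of equal shape, weighting each
    matcher by its normalized predictor score."""
    scores = [row_max_predictor(m) for m in matrices]
    total = sum(scores)
    if total > 0:
        weights = [s / total for s in scores]
    else:
        # Fall back to uniform weights when every predictor score is zero.
        weights = [1.0 / len(matrices)] * len(matrices)
    rows, cols = len(matrices[0]), len(matrices[0][0])
    combined = [
        [sum(w * m[i][j] for w, m in zip(weights, matrices)) for j in range(cols)]
        for i in range(rows)
    ]
    return weights, combined


# Two hypothetical matchers over the same 2x2 attribute pairs.
decisive = [[1.0, 0.0], [0.0, 1.0]]   # clear one-to-one preferences
undecided = [[0.5, 0.5], [0.5, 0.5]]  # no preference at all

weights, combined = weighted_ensemble([decisive, undecided])
```

In this toy setup the decisive matcher receives the larger weight, so the combined matrix leans toward its correspondences. A real predictor would need to be evaluated for correlation with actual matching quality, which is the property the abstract names first.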
