On using high-level structured queries for integrating deep-web information sources

Authors: Rafael Corchuelo, Carlos R. Rivero, David Ruiz, Rafael Z. Frantz

Abstract: The actual value of the Deep Web comes from integrating the data its applications provide. Such applications offer human-oriented search forms as their entry points, and there are a number of tools that are used to fill them in and retrieve the resulting pages programmatically. Solutions that rely on these tools are usually costly, which motivated a number of researchers to work on virtual integration, also known as metasearch. Virtual integration abstracts away from the actual search forms by providing a unified search form, i.e., a programmer fills it in and the virtual integration system translates it into the application search forms. We argue that virtual integration costs might be reduced further if another abstraction level is provided by issuing structured queries in high-level languages such as SQL, XQuery or SPARQL; this helps abstract away from search forms. As far as we know, no proposal in the literature addresses this problem. In this paper, we propose a reference framework called IntegraWeb to solve the problems of using high-level structured queries to perform deep-web data integration. Furthermore, we provide a comprehensive report on existing proposals from the database integration and the Deep Web research fields, which can be used in combination to address our problem within the previous reference framework.
