Deep web search: an overview and roadmap

Authors: Kien Tjin-Kam-Jet, Djoerd Hiemstra, Rudolf Berend Trieschnigg

Abstract: We review the state-of-the-art in deep web search and propose a novel classification scheme to better compare deep web search systems. The current binary classification (surfacing versus virtual integration) hides a number of implicit decisions that must be made by a developer. We make these decisions explicit by distinguishing 7 system aspects. We describe each system in terms of its functionality (what it can, what it cannot do) and in terms of its solution to a specific problem. We then motivate the need for a deep web search system which has a single-field free-text query interface and supports real-time structured search over multiple sources. To this end, we discuss two possible federated search architectures and state the scientific challenges. Finally, we present the findings of our ongoing project and briefly outline related work on free-text search interfaces to structured data.
