Domain-oriented Deep Web Data Sources' Discovery and Identification

Authors: Yingjun Li, Tiezheng Nie, Derong Shen, Ge Yu

DOI: 10.1109/APWEB.2010.54

Keywords: Social Semantic Web, Data mining, Information retrieval, Web modeling, Web intelligence, Computer science, Web search query, Data Web, Semantic Web Stack, Web query classification, Semantic similarity

Abstract: As the Deep Web contains a tremendous number of well-structured data sources, how to integrate data sources in the Deep Web has become a hotspot in current research. Accurately discovering and identifying Deep Web data sources related to a specific domain has become one of the key issues. We propose a Domain-Oriented Deep Web data source Discovery method (DO-DWD) and a novel Domain Identification strategy of Deep Web data sources (DIDW). In the discovery stage, we use machine learning algorithms and some heuristic rules to find the query interfaces of Deep Web data sources; in the identification stage, we identify the data sources associated with the specific domain by calculating the relevance between the query interface and the domain based on semantic similarity. Finally, extensive experiments on a real data set show that DO-DWD and DIDW achieve high correctness and accuracy.
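The identification stage described in the abstract scores how relevant a query interface is to a domain. The abstract does not specify the similarity measure, so the following is only a minimal sketch under stated assumptions: it swaps in plain token-overlap (Jaccard) similarity as a stand-in for semantic similarity, and the function names, the averaging scheme, and the 0.5 threshold are all hypothetical rather than taken from the paper.

```python
import re


def tokenize(text):
    """Lowercase a label and split it into a set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Token-overlap similarity in [0, 1].

    Assumption: this is a simple stand-in for the semantic similarity
    mentioned in the abstract, not the paper's actual measure.
    """
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def interface_relevance(field_labels, domain_terms):
    """Average, over form-field labels, of the best match against any domain term."""
    if not field_labels or not domain_terms:
        return 0.0
    term_tokens = [tokenize(term) for term in domain_terms]
    best_scores = [
        max(jaccard(tokenize(label), tokens) for tokens in term_tokens)
        for label in field_labels
    ]
    return sum(best_scores) / len(best_scores)


def belongs_to_domain(field_labels, domain_terms, threshold=0.5):
    """Hypothetical decision rule: accept the interface if relevance meets the threshold."""
    return interface_relevance(field_labels, domain_terms) >= threshold
```

For example, calling `belongs_to_domain(["Title", "Author name", "ISBN"], ["title", "author", "isbn", "publisher"])` accepts a book-search form, while a form whose labels share no tokens with the domain terms scores 0.0 and is rejected. A real system would likely replace `jaccard` with a thesaurus-based or embedding-based measure so that synonyms such as "writer" and "author" also match.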



