A deep web query interface discovery method

Authors: Bo Liu, Zhenxing Li

DOI: 10.1109/FSKD.2015.7382138

Abstract: To obtain deep web query interfaces from forms accurately, this paper proposes a framework for automatic query interface discovery, which includes the procedures of collecting pages, extracting forms and features, filtering forms, and identifying forms. A heuristic rule-based k-nearest neighbor algorithm for identifying query interfaces is introduced. In the experiments, a number of query and non-query interfaces from different domains are selected for classifying interfaces. Experimental results demonstrate that the presented method can significantly improve the accuracy of query interface discovery.
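The stages named in the abstract (extracting forms and features, filtering forms, identifying forms) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration, not the authors' method: the feature set (counts of text, select, checkbox and radio controls), the two filtering rules, the Euclidean distance, the value of `k`, and the toy training vectors are all hypothetical, since the abstract does not specify them.

```python
import math
from collections import Counter
from html.parser import HTMLParser


class FormExtractor(HTMLParser):
    """Collect simple control counts for every <form> on a page."""

    def __init__(self):
        super().__init__()
        self.forms = []        # one feature dict per form
        self._current = None   # form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {"text": 0, "password": 0, "select": 0,
                             "checkbox": 0, "radio": 0}
        elif self._current is not None:
            if tag == "input":
                kind = (attrs.get("type") or "text").lower()
                if kind in self._current:
                    self._current[kind] += 1
            elif tag == "select":
                self._current["select"] += 1

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def heuristic_label(form):
    """Hypothetical filtering rules; return None when no rule fires."""
    if form["password"] > 0:
        return "non-query"     # assumed: login or registration form
    if form["text"] + form["select"] + form["checkbox"] + form["radio"] == 0:
        return "non-query"     # assumed: no usable input controls
    return None


def to_vector(form):
    return (form["text"], form["select"], form["checkbox"], form["radio"])


def knn_label(vector, training, k=3):
    """Majority vote among the k nearest labeled vectors (Euclidean)."""
    nearest = sorted(training, key=lambda item: math.dist(vector, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def classify_forms(html, training, k=3):
    """Apply the rules first; fall back to k-NN for undecided forms."""
    parser = FormExtractor()
    parser.feed(html)
    return [heuristic_label(f) or knn_label(to_vector(f), training, k)
            for f in parser.forms]


# Toy labeled examples, invented for this sketch:
# (text, select, checkbox, radio) -> label
TRAINING = [
    ((1, 3, 0, 0), "query"),
    ((2, 2, 1, 0), "query"),
    ((3, 4, 2, 2), "query"),
    ((1, 0, 0, 0), "non-query"),
    ((2, 0, 1, 0), "non-query"),
    ((4, 0, 0, 1), "non-query"),
]
```

Calling `classify_forms(page_html, TRAINING)` returns one label per form in document order. Running the rules before the k-NN vote is one plausible reading of "heuristic rule-based"; the paper may combine the two differently.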
