Variable Length-Based Genetic Representation to Automatically Evolve Wrappers

Authors: David F. Barrero, Antonio González, María D. R-Moreno, David Camacho

DOI: 10.1007/978-3-642-12433-4_44

Abstract: The Web has been the star service on the Internet; however, the outsized amount of information available and its decentralized nature have created an intrinsic difficulty in locating, extracting, and composing information. An automatic approach is required to handle this huge amount of data. In this paper we present a machine learning algorithm based on Genetic Algorithms which generates a set of complex wrappers able to extract information from the Web. The paper presents an experimental evaluation of these wrappers over basic data sets.

References (16)
Jeffrey E. F. Friedl. Mastering Regular Expressions. O'Reilly & Associates, Inc. (1997).
David F. Barrero, David Camacho, María D. R-Moreno. Automatic Web Data Extraction Based on Genetic Algorithms and Regular Expressions. Data Mining and Multi-agent Integration, pp. 143-154 (2009). DOI: 10.1007/978-1-4419-0522-2_9
M. Dolores R-Moreno, David F. Barrero, Angel Moreno, Oscar García. SEARCHY: A Metasearch Engine for Heterogeneous Sources in Distributed Environments. International Conference on Dublin Core and Metadata Applications, pp. 235-238 (2005).
Connie Loggia Ramsey, Kenneth A. De Jong, John J. Grefenstette, Annie S. Wu, Donald S. Burke. Genome Length as an Evolutionary Self-adaptation. Lecture Notes in Computer Science, pp. 345-353 (1998). DOI: 10.1007/BFB0056877
María D. R-Moreno, David F. Barrero, David Camacho. Semantic Wrappers for Semi-Structured Data Extraction (2008).
E. Mark Gold. Language Identification in the Limit. Information and Control, vol. 10, pp. 447-474 (1967). DOI: 10.1016/S0019-9958(67)91165-5