Olera: semisupervised Web-data extraction with visual support

Authors: Chia-Hui Chang, Shih-Chien Kuo

DOI: 10.1109/MIS.2004.71

Abstract: Olera is a semisupervised information-extraction system that produces extraction rules from semistructured Web documents without requiring detailed annotation of the training …

References (10)
Nicholas Kushmerick, Daniel S. Weld. Wrapper induction for information extraction. International Joint Conference on Artificial Intelligence, pp. 729-737 (1997).
Alberto H. F. Laender, Berthier A. Ribeiro-Neto, Altigran S. da Silva, Juliana S. Teixeira. A brief survey of web data extraction tools. ACM SIGMOD Record, vol. 31, pp. 84-93 (2002). DOI: 10.1145/565117.565137
Chun-Nan Hsu, Ming-Tzung Dung. Generating finite-state transducers for semi-structured data extraction from the Web. Information Systems, vol. 23, pp. 521-538 (1998). DOI: 10.1016/S0306-4379(98)00027-1
Chia-Hui Chang, Chun-Nan Hsu, Shao-Cheng Lui. Automatic information extraction from semi-structured Web pages by pattern discovery. Decision Support Systems, vol. 35, pp. 129-147 (2003). DOI: 10.1016/S0167-9236(02)00100-8
D. Gusfield. Efficient methods for multiple sequence alignment with guaranteed error bounds. Bulletin of Mathematical Biology, vol. 55, pp. 141-154 (1993). DOI: 10.1016/S0092-8240(05)80066-7
Paolo Merialdo, Valter Crescenzi, Giansalvatore Mecca. RoadRunner: Towards Automatic Data Extraction from Large Web Sites. Very Large Data Bases, pp. 109-118 (2001).
Arvind Arasu, Hector Garcia-Molina. Extracting structured data from Web pages. International Conference on Management of Data, pp. 337-348 (2003). DOI: 10.1145/872757.872799
Guizhen Yang, I. V. Ramakrishnan, Michael Kifer. On the complexity of schema inference from web pages in the presence of nullable data attributes. Conference on Information and Knowledge Management, pp. 224-231 (2003). DOI: 10.1145/956863.956907
Ion Muslea, Steve Minton, Craig Knoblock. A hierarchical approach to wrapper induction. Adaptive Agents and Multi-Agents Systems, pp. 190-197 (1999). DOI: 10.1145/301136.301191
Chia-Hui Chang, Shao-Chen Lui. IEPAD. Proceedings of the tenth international conference on World Wide Web - WWW '01, pp. 681-688 (2001). DOI: 10.1145/371920.372182