Learning data prototypes for information extraction

Authors: Steven Minton, Kristina Lerman

DOI:

Keywords:

Abstract: A method for determining statistically significant token sequences lends itself to use in the recognition of broken wrappers as well as the construction of new wrapper rules. When rules are needed because the underlying wrapped data has changed, training examples are used to recognize rule candidates, which are culled with a bias toward those that would probably be more successful. The resulting candidate set is clustered according to feature characteristics, then compared to the examples. Those most similar create
