System and method for using anchor text as training data for classifier-based search systems

Authors: Harr Chen, Adwait Ratnaparkhi, Sonja Knoll, Hsiao-Wuen Hon

Keywords: Learning classifier system, Anchor text, Classifier (UML), Training set, Artificial intelligence, User input, Natural language processing, Machine learning, Computer science

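Abstract: A computer-implemented information retrieval system is provided. The system includes a user input configured to receive a user query relative to the corpus. The system also includes a machine learning classifier trained with a first set of training data comprising anchor text of at least some documents in the corpus. A processing unit is adapted to interact with the user input and the classifier to obtain search results using the classifier. In some aspects, the classifier is also trained with a second set of training data. A method of integrating a new document into a corpus, for retrieving documents from the corpus using two distinct types of training data, is also provided.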
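
The arrangement described in the abstract can be illustrated with a minimal sketch. The abstract does not name a classifier type, so this sketch assumes a multinomial naive Bayes model with Laplace smoothing whose classes are document IDs. The class name `AnchorTextClassifier`, the document IDs, the sample anchor texts, and the choice of document titles as the second type of training data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class AnchorTextClassifier:
    """Multinomial naive Bayes whose classes are document IDs (assumed model)."""

    def __init__(self):
        self.term_counts = defaultdict(Counter)  # doc_id -> term frequencies
        self.example_counts = Counter()          # doc_id -> number of training texts
        self.vocabulary = set()

    def train(self, examples):
        """Add (text, doc_id) pairs; can be called once per training set."""
        for text, doc_id in examples:
            terms = text.lower().split()
            self.term_counts[doc_id].update(terms)
            self.example_counts[doc_id] += 1
            self.vocabulary.update(terms)

    def search(self, query, top_k=3):
        """Rank doc IDs by log P(doc) + sum of log P(term | doc)."""
        terms = query.lower().split()
        total_examples = sum(self.example_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for doc_id, counts in self.term_counts.items():
            doc_total = sum(counts.values())
            score = math.log(self.example_counts[doc_id] / total_examples)
            for term in terms:
                # Laplace smoothing keeps unseen terms from zeroing the score.
                score += math.log((counts[term] + 1) / (doc_total + vocab_size))
            scores[doc_id] = score
        return sorted(scores, key=scores.get, reverse=True)[:top_k]


# First training set: anchor text of links pointing at each document (made-up data).
anchor_examples = [
    ("download python installer", "doc_python"),
    ("python language home", "doc_python"),
    ("rust compiler download", "doc_rust"),
    ("rust language book", "doc_rust"),
]
# Second training set: document titles (an assumed choice of second data type).
title_examples = [
    ("welcome to python", "doc_python"),
    ("the rust programming language", "doc_rust"),
]

classifier = AnchorTextClassifier()
classifier.train(anchor_examples)
classifier.train(title_examples)
results = classifier.search("python download")
print(results)
```

In this sketch, a new document could be integrated into the corpus by calling `train` again with the anchor text of links that point to it, for example `classifier.train([("go language tour", "doc_go")])`. Whether the patented system updates its classifier this way is not stated in the abstract.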

References (10)
Ion Muslea, Craig A. Knoblock, Steven Minton, Wrapper induction by hierarchical data analysis (2000)
Kristopher E. Nybakken, Brian L. Hazlehurst, Scott M. Burke, Intelligent query system for automatically indexing in a database and automatically categorizing users (1999)
Seán Slattery, Kamal Nigam, Andrew McCallum, Mark Craven, Dayne Freitag, Tom Mitchell, Dan DiPasquo, Learning to extract symbolic knowledge from the World Wide Web, National Conference on Artificial Intelligence, pp. 509-516 (1998)
Marc A. Smith, Duncan L. Davenport, Jesper B. Lind, Eric D. Brill, Wensi Xi, Systems and methods that rank search results (2004)
Daniel Lulich, Farzin Guilak, Paul Rehfuss, Self-improving system and method for classifying pages on the world wide web (2003)