Some Issues in the Automatic Classification of U.S. Patents. Working Notes for the AAAI-98 Workshop on Learning for Text Categorization

Author: Leah S. Larkey

DOI:

Keywords: Information retrieval; Structure (mathematical logic); Factor (programming language); Classifier (linguistics); Automation; Computer science; Document Structure Description; Bayesian probability; Text categorization; Representation (mathematics)

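Abstract: The classification of U.S. patents poses some special problems due to the enormous size of the corpus, the size and complex hierarchical structure of the classification system, and the size and structure of patent documents. The representation of the complex structure of documents has not received a great deal of previous attention, but we have found it to be an important factor in our work. We are exploring ways to use this structure and the hierarchical relations among subclasses to facilitate the classification of patents. Our approach is to derive a vector of terms and phrases from the most important parts of the patent to represent each document. We use both k-nearest-neighbor classifiers and Bayesian classifiers. The k-nearest-neighbor classifier allows us to represent the document structure using the query operators in the Inquery information retrieval system. The Bayesian classifiers can use the hierarchical relations among patent subclasses to select closely related negative examples to train more discriminating classifiers.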
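
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. Plain cosine similarity stands in for the Inquery query operators, the section weights are made-up values, and the names `SECTION_WEIGHTS`, `to_vector`, `knn_classify`, `sibling_negatives`, and `parent_of` are hypothetical. The Bayesian classifier itself is not shown, only the selection of closely related negative examples from sibling subclasses.

```python
import math
from collections import Counter

# Hypothetical section weights: illustrative assumptions, not values from the paper.
SECTION_WEIGHTS = {"title": 3.0, "abstract": 2.0, "claims": 1.0}


def to_vector(patent):
    """Build a weighted term-frequency vector from a patent's sections."""
    vec = Counter()
    for section, weight in SECTION_WEIGHTS.items():
        for term in patent.get(section, "").lower().split():
            vec[term] += weight
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(query, training, k=3):
    """Return the class whose members among the k nearest patents score highest."""
    q = to_vector(query)
    ranked = sorted(
        ((cosine(q, to_vector(doc)), label) for doc, label in training),
        reverse=True,
    )
    scores = Counter()
    for sim, label in ranked[:k]:
        scores[label] += sim  # sum similarity per class over the k neighbors
    return scores.most_common(1)[0][0]


def sibling_negatives(target, parent_of, training):
    """Pick negative examples from subclasses that share the target's parent class."""
    parent = parent_of[target]
    return [
        doc
        for doc, label in training
        if label != target and parent_of.get(label) == parent
    ]
```

Here `training` is assumed to be a list of `(patent, label)` pairs, where each patent is a dict mapping section names to text, and `parent_of` maps each subclass label to its parent class. Restricting negatives to sibling subclasses mirrors the abstract's point that closely related negative examples should yield more discriminating classifiers than negatives drawn from the whole corpus.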

References (12)
Yang Y, An evaluation of statistical approaches to MEDLINE indexing. Conference of American Medical Informatics Association. pp. 358-362 (1996)
Norbert Fuhr, Kostas Tzeras, Gerhard Knorz, Stephan Hartmann, Michael Schwantner, Gerhard Lustig, AIR/X - A rule-based multistage indexing system for large subject fields. Intelligent Text and Image Handling. pp. 606-623 (1991)
Jinxi Xu, John Broglio, Bruce Croft, The Design and Implementation of a Part of Speech Tagger for English. University of Massachusetts. (1994)
James P. Callan, John Broglio, James Allan, Lisa Ballesteros, W. Bruce Croft, Jinxi Xu, Hongming Shu, INQUERY at TREC-5. Text Retrieval Conference. pp. 119-132 (1996)
Norbert Fuhr, Models for retrieval with probabilistic indexing. Information Processing and Management. vol. 25, pp. 55-72 (1989), 10.1016/0306-4573(89)90091-5
James P. Callan, W. Bruce Croft, John Broglio, TREC and TIPSTER experiments with INQUERY. Information Processing and Management. vol. 31, pp. 327-343 (1995), 10.1016/0306-4573(94)00050-D
Hwee Tou Ng, Wei Boon Goh, Kok Leong Low, Feature selection, perceptron learning, and a usability case study for text categorization. International ACM SIGIR Conference on Research and Development in Information Retrieval. vol. 31, pp. 67-73 (1997), 10.1145/258525.258537
Leah S. Larkey, W. Bruce Croft, Combining classifiers in text categorization. International ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 289-297 (1996), 10.1145/243199.243276
Robert Krovetz, Viewing morphology as an inference process. International ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 191-202 (1993), 10.1145/160688.160718
David Hickam, William Hersh, Chris Buckley, T. J. Leone, OHSUMED: an interactive retrieval evaluation and new large test collection for research. International ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 192-201 (1994), 10.5555/188490.188557