Authors: Manabu Torii, John Miller, K. Vijay-Shanker
DOI:
Keywords: Natural language processing, Artificial intelligence, Domain (biology), Hidden Markov model, Computer science, Part-of-speech tagging, Component (UML), Lexicon, Speech recognition
Abstract: Part-of-speech tagging is a fundamental component in many NLP systems. When taggers developed in one domain are used in another domain, the performance can degrade considerably. We present a method for developing taggers for new domains without requiring POS-annotated text in the new domain. Our method involves using raw domain text and identifying related words to form a domain-specific lexicon. This lexicon provides the initial lexical probabilities for EM training of an HMM model. We evaluate the method by applying it to the Biology domain and show that we achieve results that are comparable with some taggers developed for this domain.
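
The last step in the abstract, using a lexicon to supply initial lexical probabilities for EM training of an HMM, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a bigram HMM, uniform initial start and transition probabilities, a uniform spread of each tag's emission mass over the words the lexicon allows for it, and that every sentence has non-zero probability under the initial model. The function names, the three-tag tagset, and the toy lexicon and corpus are hypothetical. How the domain lexicon is built from raw text is not shown.

```python
def init_emissions(lexicon, tags, vocab):
    """Initial P(word | tag): uniform over the words the lexicon allows for the tag.
    Words missing from the lexicon are assumed to allow every tag."""
    emit = {}
    for t in tags:
        allowed = {w for w in vocab if t in lexicon.get(w, tags)}
        emit[t] = {w: (1.0 / len(allowed) if w in allowed else 0.0) for w in vocab}
    return emit

def forward_backward(sent, tags, start, trans, emit):
    """Scaled forward and backward passes for one sentence."""
    n = len(sent)
    alpha, scale = [{} for _ in range(n)], [0.0] * n
    for i, w in enumerate(sent):
        for t in tags:
            prior = start[t] if i == 0 else sum(alpha[i - 1][s] * trans[s][t] for s in tags)
            alpha[i][t] = prior * emit[t][w]
        scale[i] = sum(alpha[i].values())  # assumed non-zero (see lead-in)
        for t in tags:
            alpha[i][t] /= scale[i]
    beta = [{} for _ in range(n)]
    for t in tags:
        beta[n - 1][t] = 1.0
    for i in range(n - 2, -1, -1):
        for s in tags:
            beta[i][s] = sum(trans[s][t] * emit[t][sent[i + 1]] * beta[i + 1][t]
                             for t in tags) / scale[i + 1]
    return alpha, beta, scale

def normalize(counts, fallback):
    """Turn expected counts into probabilities; keep the old values if nothing was counted."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()} if total > 0 else dict(fallback)

def em_step(corpus, tags, start, trans, emit):
    """One Baum-Welch iteration: collect expected counts, then re-estimate."""
    s_cnt = {t: 0.0 for t in tags}
    t_cnt = {s: {t: 0.0 for t in tags} for s in tags}
    e_cnt = {t: {w: 0.0 for w in emit[t]} for t in tags}
    for sent in corpus:
        alpha, beta, scale = forward_backward(sent, tags, start, trans, emit)
        for i, w in enumerate(sent):
            for t in tags:
                gamma = alpha[i][t] * beta[i][t]  # posterior P(tag_i = t | sentence)
                e_cnt[t][w] += gamma
                if i == 0:
                    s_cnt[t] += gamma
            if i + 1 < len(sent):
                nxt = sent[i + 1]
                for s in tags:
                    for t in tags:
                        t_cnt[s][t] += (alpha[i][s] * trans[s][t] * emit[t][nxt]
                                        * beta[i + 1][t] / scale[i + 1])
    return (normalize(s_cnt, start),
            {s: normalize(t_cnt[s], trans[s]) for s in tags},
            {t: normalize(e_cnt[t], emit[t]) for t in tags})

def train(corpus, tags, lexicon, iterations=10):
    """EM training on untagged sentences, starting from lexicon-based emissions."""
    vocab = sorted({w for sent in corpus for w in sent})
    start = {t: 1.0 / len(tags) for t in tags}
    trans = {s: {t: 1.0 / len(tags) for t in tags} for s in tags}
    emit = init_emissions(lexicon, tags, vocab)
    for _ in range(iterations):
        start, trans, emit = em_step(corpus, tags, start, trans, emit)
    return start, trans, emit

if __name__ == "__main__":
    # Hypothetical toy data: "binds" is ambiguous between noun and verb.
    tags = ["DT", "NN", "VB"]
    lexicon = {"the": {"DT"}, "protein": {"NN"}, "receptor": {"NN"}, "binds": {"NN", "VB"}}
    corpus = [["the", "protein", "binds", "the", "receptor"],
              ["the", "receptor", "binds", "the", "protein"]]
    start, trans, emit = train(corpus, tags, lexicon)
    print(trans["NN"])
```

In this sketch a word/tag pair that the lexicon rules out starts with emission probability zero and stays at zero through every EM iteration, because its posterior always contains that zero factor. That is one way a lexicon can constrain unsupervised training without any annotated text.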