Towards Robust Cross-Domain Domain Adaptation for Part-of-Speech Tagging

Authors: Hinrich Schütze, Tobias Schnabel

DOI: 10.18419/OPUS-3064

Abstract: Most systems in natural language processing experience a substantial loss in performance when the data that the system is tested with differs significantly from the data that the system has been trained on. Systems for part-of-speech (POS) tagging, for example, are typically trained on newspaper texts but are often applied to texts of other domains such as medical texts. Domain adaptation (DA) techniques seek to improve such systems so that they are able to achieve consistently good performance, independent of the domain at hand. We investigate the robustness of domain adaptation representations and methods across target domains using part-of-speech tagging as a case study. We find that there is no single representation and method that works equally well for all target domains. In particular, there are large differences between target domains that are more similar to the source domain and those that are less similar.
