Effect of Data Imbalance on Unsupervised Domain Adaptation of Part-of-Speech Tagging and Pivot Selection Strategies.

Authors: Frans Coenen, Danushka Bollegala, Xia Cui

DOI:

Keywords: Speech recognition, Selection (genetic algorithm), Data imbalance, Domain adaptation, Part-of-speech tagging, Computer science

Abstract:

References (17)
Hinrich Schütze, Tobias Schnabel. Towards Robust Cross-Domain Domain Adaptation for Part-of-Speech Tagging. International Joint Conference on Natural Language Processing, pp. 198-206 (2013). DOI: 10.18419/OPUS-3064
Mitch Marcus, Beatrice Santorini, Mary Ann Marcinkiewicz. Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics, vol. 19, pp. 313-330 (1993). DOI: 10.21236/ADA273556
Zhaohui Zheng, Xiaoyun Wu, Rohini Srihari. Feature selection for text categorization on imbalanced data. ACM SIGKDD Explorations Newsletter, vol. 6, pp. 80-89 (2004). DOI: 10.1145/1007730.1007741
Hongyu Guo, Herna L. Viktor. Learning from imbalanced data sets with boosting and data generation. ACM SIGKDD Explorations Newsletter, vol. 6, pp. 30-39 (2004). DOI: 10.1145/1007730.1007736
Hal Daumé III, Avishek Saha, Abhishek Kumar. Frustratingly Easy Semi-Supervised Domain Adaptation. Proceedings of the 2010 Workshop on Domain Adaptation for Natural Language Processing, pp. 53-59 (2010).
Shai Ben-David, John Blitzer, Koby Crammer, Alex Kulesza, Fernando Pereira, Jennifer Wortman Vaughan. A theory of learning from different domains. Machine Learning, vol. 79, pp. 151-175 (2010). DOI: 10.1007/S10994-009-5152-4
Ryan McDonald, Slav Petrov. Overview of the 2012 Shared Task on Parsing the Web (2012).
ChengXiang Zhai, Jing Jiang. Instance Weighting for Domain Adaptation in NLP. Meeting of the Association for Computational Linguistics, pp. 264-271 (2007).
Haibo He, E.A. Garcia. Learning from Imbalanced Data. IEEE Transactions on Knowledge and Data Engineering, vol. 21, pp. 1263-1284 (2009). DOI: 10.1109/TKDE.2008.239
Hal Daumé III. Frustratingly Easy Domain Adaptation. Meeting of the Association for Computational Linguistics, pp. 256-263 (2007).