A New Data Selection Approach for Semi-Supervised Acoustic Modeling

Authors: Rong Zhang, A.I. Rudnicky

DOI: 10.1109/ICASSP.2006.1660047

Abstract: Current approaches to semi-supervised incremental learning prefer to select unlabeled examples predicted with high confidence for model re-training. However, this strategy can degrade classification performance rather than improve it. We present an analysis of the reasons for this phenomenon, showing that relying only on high confidence for data selection can lead to an erroneous estimate of the true distribution when the confidence annotator is highly correlated with the classifier in the information they use. We propose a new data selection approach to address this problem and apply it to a variety of applications, including machine learning and speech recognition. Encouraging improvements in recognition accuracy are observed in our experiments.
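For orientation, the conventional strategy that the abstract criticizes (keeping only the unlabeled examples that the current model predicts with high confidence) can be sketched roughly as follows. This is a minimal illustration of that baseline only, not the selection approach proposed in the paper; the function name `select_high_confidence`, the threshold value, and the toy posteriors are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def select_high_confidence(
    posteriors: Sequence[Sequence[float]],
    threshold: float = 0.9,  # assumed cutoff, not taken from the paper
) -> List[Tuple[int, int]]:
    """Return (example_index, predicted_label) pairs whose top
    posterior probability is at least `threshold`."""
    selected = []
    for i, probs in enumerate(posteriors):
        # Predicted label is the class with the highest posterior.
        label = max(range(len(probs)), key=lambda k: probs[k])
        if probs[label] >= threshold:
            selected.append((i, label))
    return selected


# Toy posteriors for three unlabeled examples over two classes.
posteriors = [[0.95, 0.05], [0.55, 0.45], [0.10, 0.90]]
selected = select_high_confidence(posteriors, threshold=0.9)
print(selected)
```

In this sketch the uncertain second example is never added to the training set, so the re-trained model sees only examples the current classifier already handles confidently. That is the kind of selection bias the abstract points to when the confidence score and the classifier rely on the same information.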
