Combining ANNs to improve phone recognition

Author: B. Mak

DOI: 10.1109/ICASSP.1997.595487

Abstract: In applying neural networks to speech recognition, one often finds that slightly different training configurations lead to significantly different networks. Thus different training sessions using different setups will likely end up in "mixed" network configurations representing different solutions in different regions of the data space. This sensitivity to the initial weights assigned, the training parameters and the training data can be used to enhance performance, using a committee of neural networks. We study various ways to combine context-dependent (CD) and context-independent neural network phone estimators to improve phone recognition. As a result, we obtain 6.3% and 2.2% increases in accuracy in phone recognition using monophones and biphones respectively.
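The committee idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which combination rules the paper actually evaluates, so the two rules below (arithmetic averaging and a normalized geometric mean of per-frame phone posteriors) are generic assumptions, and the function names, phone labels, and probability values are all hypothetical.

```python
import math

def average_posteriors(member_outputs):
    """Arithmetic mean of per-phone posteriors across committee members."""
    n_members = len(member_outputs)
    n_phones = len(member_outputs[0])
    return [sum(m[k] for m in member_outputs) / n_members
            for k in range(n_phones)]

def geometric_posteriors(member_outputs, floor=1e-10):
    """Normalized geometric mean, computed as an average in the log domain."""
    n_members = len(member_outputs)
    n_phones = len(member_outputs[0])
    log_means = [
        sum(math.log(max(m[k], floor)) for m in member_outputs) / n_members
        for k in range(n_phones)
    ]
    peak = max(log_means)  # subtract the peak for numerical stability
    unnorm = [math.exp(v - peak) for v in log_means]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def best_phone(posteriors, phones):
    """Return the phone label with the highest combined posterior."""
    return phones[max(range(len(phones)), key=lambda k: posteriors[k])]

# Hypothetical outputs of three networks for one speech frame.
phones = ["aa", "iy", "s"]
frame_outputs = [
    [0.50, 0.40, 0.10],  # network 1 prefers "aa"
    [0.30, 0.60, 0.10],  # network 2 prefers "iy"
    [0.35, 0.55, 0.10],  # network 3 prefers "iy"
]

avg = average_posteriors(frame_outputs)
geo = geometric_posteriors(frame_outputs)
print(best_phone(avg, phones), best_phone(geo, phones))
```

In this toy frame the first network disagrees with the other two, and both combination rules side with the majority. Whether averaging in the probability domain or the log domain works better on real data is an empirical question that this sketch does not answer.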

References (12)
David H. Wolpert, Stacked generalization. Neural Networks, vol. 5, pp. 241-259 (1992). DOI: 10.1016/S0893-6080(05)80023-1
Yeshwant K. Muthusamy, Ronald A. Cole, Beatrice T. Oshika, The OGI multi-language telephone speech corpus. Conference of the International Speech Communication Association (1992).
Mike Noel, Terri Lander, Ronald A. Cole, T. Durham, New telephone speech corpora at CSLU. Conference of the International Speech Communication Association (1995).
D. Burshtein, Robust parametric modeling of durations in hidden Markov models. International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 548-551 (1995). DOI: 10.1109/ICASSP.1995.479656
Etienne Barnard, Ronald Cole, Mark Fanty, Pieter J. E. Vermeulen, Real-world speech recognition with neural networks. Applications and Science of Artificial Neural Networks, vol. 2492, pp. 524-537 (1995). DOI: 10.1117/12.205157
Bambang Parmanto, Howard R. Doyle, Paul W. Munro, Improving Committee Diagnosis with Resampling Techniques. Neural Information Processing Systems, vol. 8, pp. 882-888 (1995).
A. Waibel, T. Hanazawa, G. Hinton, K. Shikano, K.J. Lang, Phoneme recognition using time-delay neural networks. IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 37, pp. 393-404 (1989). DOI: 10.1109/29.21701
L.K. Hansen, P. Salamon, Neural network ensembles. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, pp. 993-1001 (1990). DOI: 10.1109/34.58871
Tony Robinson, Frank Fallside, A recurrent error propagation network speech recognition system. Computer Speech & Language, vol. 5, pp. 259-274 (1991). DOI: 10.1016/0885-2308(91)90010-N
B. Mak, E. Barnard, Phone clustering using the Bhattacharyya distance. International Conference on Spoken Language Processing, vol. 4, pp. 2005-2008 (1996). DOI: 10.1109/ICSLP.1996.607191