Large-vocabulary isolated word recognition with fast coarse time alignment

Authors: A. Aktas, B. Kammerer, W. Kupper, H. Lagger

DOI: 10.1109/ICASSP.1986.1169201

Abstract: An isolated word recognition system for large vocabularies (1000 words and up) with an average recognition rate of more than 98 per cent is presented. Each utterance is characterized by a sequence of feature vectors which are obtained by autocorrelation analysis. The resulting coefficients are quantized in such a way that the entire feature vector can be stored in a single data word. A distance measure adapted to this representation has been developed. The classification is performed hierarchically in two steps. In the preselection stage, each utterance is divided into 16 segments and hardware is employed for a coarse nonlinear time mapping. A short ranked list of candidates is then processed by the final classifier, which performs time alignment of the fully resolved patterns using Dynamic Programming. Thus a short response time and high performance are achieved. Without full use of parallelism, the overall response time for the vocabulary is less than one second on a signal processor.
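The two-step classification described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Only the 16-segment count and the use of Dynamic Programming in the final stage come from the abstract. Everything else is an assumption made for illustration: the names `coarse_pattern`, `dtw_distance`, and `recognize`, the city-block frame distance, the shortlist size, and the use of equal-length (linear) segments in place of the paper's coarse nonlinear mapping and hardware preselection.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]
Pattern = List[Vector]  # one feature vector per analysis frame

N_SEGMENTS = 16  # segment count stated in the abstract


def frame_distance(a: Vector, b: Vector) -> float:
    """City-block distance; an assumed stand-in for the paper's measure."""
    return sum(abs(x - y) for x, y in zip(a, b))


def coarse_pattern(pattern: Pattern, n_segments: int = N_SEGMENTS) -> Pattern:
    """Reduce a pattern to n_segments vectors by averaging equal stretches.

    Assumption: linear segmentation, a simplification of the coarse
    nonlinear mapping mentioned in the abstract.
    """
    n, dim = len(pattern), len(pattern[0])
    coarse = []
    for s in range(n_segments):
        start = s * n // n_segments
        end = max(start + 1, (s + 1) * n // n_segments)
        frames = pattern[start:end]
        coarse.append([sum(f[d] for f in frames) / len(frames) for d in range(dim)])
    return coarse


def coarse_distance(a: Pattern, b: Pattern) -> float:
    """Segment-by-segment distance between two coarse patterns."""
    return sum(frame_distance(x, y) for x, y in zip(a, b))


def dtw_distance(a: Pattern, b: Pattern) -> float:
    """Dynamic-programming time alignment over the fully resolved patterns."""
    inf = float("inf")
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0
    for i in range(1, len(a) + 1):
        cur = [inf] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            cost = frame_distance(a[i - 1], b[j - 1])
            cur[j] = cost + min(prev[j], prev[j - 1], cur[j - 1])
        prev = cur
    return prev[len(b)]


def recognize(utterance: Pattern, templates: Dict[str, Pattern],
              shortlist_size: int = 3) -> str:
    """Stage 1: rank all words coarsely. Stage 2: DP alignment on the shortlist."""
    coarse_u = coarse_pattern(utterance)
    ranked = sorted(
        templates,
        key=lambda w: coarse_distance(coarse_u, coarse_pattern(templates[w])),
    )
    shortlist = ranked[:shortlist_size]
    return min(shortlist, key=lambda w: dtw_distance(utterance, templates[w]))
```

The point of the split is cost: the coarse comparison touches only 16 vectors per word and can be run against the whole vocabulary, while the quadratic-time alignment is applied only to the few candidates that survive preselection. In practice the coarse templates would be precomputed once rather than recomputed per query as in this sketch.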

References (7)
K. N. Stevens. Autocorrelation Analysis of Speech Sounds. Journal of the Acoustical Society of America, vol. 22, pp. 677-677, 1950. DOI: 10.1121/1.1906687
T. K. Vintsyuk. Speech discrimination by dynamic programming. Cybernetics, vol. 4, pp. 52-57, 1972. DOI: 10.1007/BF01074755
H. Lagger, A. Waibel. A coarse phonetic knowledge source for template independent large vocabulary word recognition. ICASSP '85, IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 10, pp. 862-865, 1985. DOI: 10.1109/ICASSP.1985.1168314
B. Kammerer, W. Kupper, H. Lagger. Special feature vector coding and appropriate distance definition developed for a speech recognition system. International Conference on Acoustics, Speech, and Signal Processing, vol. 9, pp. 13-16, 1984. DOI: 10.1109/ICASSP.1984.1172564
T. Kaneko, N. Dixon. A hierarchical decision approach to large-vocabulary discrete utterance recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 31, pp. 1061-1066, 1983. DOI: 10.1109/TASSP.1983.1164211
A. Buzo, A. Gray, R. Gray, J. Markel. Speech coding based upon vector quantization. IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 28, pp. 562-574, 1980. DOI: 10.1109/TASSP.1980.1163445