Reducing computational complexity and response latency through the detection of contentless frames

Authors: R.A. Sukkar, S.M. Herman, A.R. Setlur, C.D. Mitchell

DOI: 10.1109/ICASSP.2000.860218

Keywords: Vocabulary, Vector quantization, Latency (engineering), Speech recognition, Speech coding, Decoding methods, Computational complexity theory, Classifier (UML), Computer science, Silence

Abstract: In this paper, we present a method that manipulates the decoding network to reduce both computational complexity and response latency while maintaining high ASR accuracy. The method employs a TSVQ (tree-structured vector quantization) classifier that reliably discriminates between silence and non-silence frames. Reductions in complexity and latency are achieved through three techniques: 1) frame skipping, 2) silence-based pruning of the dynamic programming network, and 3) early decision. Experimental results on a connected digit task and a large vocabulary company name task show that the proposed method can reduce response latency by more than 82%. Furthermore, computational complexity, measured in CPU seconds, was reduced by 13.6% and 6.7% on the two tasks, respectively, while maintaining the recognition accuracy of the baseline system.
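As a rough illustration of two of the ideas named in the abstract, the sketch below classifies frames with a small tree-structured quantizer and then applies frame skipping and an early-decision rule. It is a minimal sketch under stated assumptions, not the authors' implementation: the `Node` layout, the one-dimensional features, the centroid values, and the `end_run` threshold are all hypothetical, and silence-based pruning of the dynamic programming network is not shown.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Node:
    """One node of a binary TSVQ tree (hypothetical layout)."""
    centroid: Sequence[float]
    is_silence: Optional[bool] = None   # meaningful on leaves only
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(root: Node, frame: Sequence[float]) -> bool:
    """Descend the tree, following the nearer child centroid at each level.

    Returns True when the leaf reached is labeled as silence.
    """
    node = root
    while node.left is not None and node.right is not None:
        d_left = sq_dist(frame, node.left.centroid)
        d_right = sq_dist(frame, node.right.centroid)
        node = node.left if d_left <= d_right else node.right
    return bool(node.is_silence)


def skip_silence(root: Node, frames: List[Sequence[float]],
                 end_run: int = 3) -> Tuple[List[int], Optional[int]]:
    """Frame skipping plus a simple early-decision rule (both assumed).

    Silence frames are not passed on to the decoder. Once speech has been
    seen, `end_run` consecutive silence frames trigger an early stop.
    Returns (indices of kept frames, index where decoding stopped or None).
    """
    kept: List[int] = []
    run = 0
    for i, frame in enumerate(frames):
        if classify(root, frame):
            run += 1
            if kept and run >= end_run:
                return kept, i
        else:
            run = 0
            kept.append(i)
    return kept, None


# Toy tree over a single energy-like feature (values are made up).
root = Node(
    centroid=[0.5],
    left=Node(centroid=[0.1], is_silence=True),
    right=Node(
        centroid=[1.0],
        left=Node(centroid=[0.7], is_silence=False),
        right=Node(centroid=[1.5], is_silence=False),
    ),
)
frames = [[0.05], [0.1], [0.8], [1.4], [0.12], [0.9], [0.1], [0.0], [0.05], [0.7]]
kept, stop = skip_silence(root, frames)
print(kept, stop)
```

The tree search costs one pair of distance computations per level rather than one per codeword, which is why a tree-structured quantizer is a cheap front end for a per-frame silence decision. In this toy run, leading silence is skipped without triggering the stop rule, because the early decision is only allowed after at least one non-silence frame has been kept.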


