Authors: R.A. Sukkar, S.M. Herman, A.R. Setlur, C.D. Mitchell
DOI: 10.1109/ICASSP.2000.860218
Keywords: Vocabulary, Vector quantization, Latency (engineering), Speech recognition, Speech coding, Decoding methods, Computational complexity theory, Classifier (UML), Computer science, Silence
Abstract: In this paper, we present a method that manipulates the decoding network to reduce both computational complexity and response latency while maintaining high ASR accuracy. The method employs a TSVQ (tree-structured vector quantization) classifier that reliably discriminates between silence and non-silence frames. Reductions in complexity and latency are achieved through three techniques: 1) frame skipping, 2) silence-based pruning of the dynamic programming network, and 3) early decision. Experimental results on a connected digit task and a large vocabulary company name task show that the proposed method can reduce response latency by more than 82%. Furthermore, computational complexity, measured in CPU seconds, was reduced by 13.6% and 6.7% while maintaining the recognition accuracy of the baseline system.
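
Illustrative sketch (not the authors' implementation): the abstract names a tree-structured VQ classifier that labels frames as silence or non-silence, and a skipping step that relies on those labels. The Python below shows one plausible shape for that idea under loud assumptions: a binary tree whose leaves carry a silence flag, squared Euclidean distance as the distortion measure, two-dimensional toy feature vectors, and a policy that drops every silence frame. The names `Node`, `classify_silence`, and `skip_silence_frames`, and all centroid values, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Node:
    """One node of a binary tree-structured codebook (hypothetical layout)."""
    centroid: Sequence[float]
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    is_silence: Optional[bool] = None  # meaningful on leaves only


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance (assumed distortion measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify_silence(root: Node, frame: Sequence[float]) -> bool:
    """Descend the tree, following the closer child centroid at each level,
    and return the silence flag stored on the leaf that is reached."""
    node = root
    while node.left is not None and node.right is not None:
        d_left = sq_dist(frame, node.left.centroid)
        d_right = sq_dist(frame, node.right.centroid)
        node = node.left if d_left <= d_right else node.right
    return bool(node.is_silence)


def skip_silence_frames(root: Node, frames: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep only frames classified as non-silence (a simplified skipping policy)."""
    return [f for f in frames if not classify_silence(root, f)]


# Toy codebook with made-up centroids: one silence leaf, two speech leaves.
toy_tree = Node(
    centroid=[2.5, 2.5],
    left=Node(centroid=[0.5, 0.5], is_silence=True),
    right=Node(
        centroid=[5.0, 5.0],
        left=Node(centroid=[4.0, 4.0], is_silence=False),
        right=Node(centroid=[6.0, 6.0], is_silence=False),
    ),
)

toy_frames = [[0.2, 0.4], [4.1, 3.9], [0.6, 0.3], [6.2, 5.8]]
kept_frames = skip_silence_frames(toy_tree, toy_frames)
```

A tree search costs one pair of distance computations per level rather than one per codeword, which is the usual motivation for a tree-structured codebook. Frames removed this way never reach the decoder; how the published method decides which silence frames to skip, and how it prunes the network or makes early decisions, is not specified in the abstract and is not modeled here.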