Authors: Yu Tsao, Chin-Hui Lee
DOI:
Keywords: Reduction (complexity), Speech recognition, Adaptation (computer science), Gaussian, Computer science, Pattern recognition, Vector space, Interpolation, Hidden Markov model, Transformation (function), Artificial intelligence, Approximation error
Abstract: We propose a vector space approach to characterizing environments for robust speech recognition. We represent a given environment by a super-vector formed by concatenating all the mean vectors of the Gaussian mixture components of the state observation densities of the hidden Markov models trained in that particular environment. New super-vectors can now be obtained either by an interpolation method with a collection of super-vectors from many real or simulated environments, or by a transformation performed on an anchor super-vector for a specific environment, such as a clean condition. At a 5dB signal-to-noise ratio (SNR) level, both interpolation- and transformation-based approaches achieve a significant error rate reduction of close to 47% from a baseline system with cepstral mean subtraction (CMS) using only two adaptation utterances. When incorporating N-best information to perform unsupervised adaptation at the same SNR with the same adaptation utterances, we achieve a relative error reduction of about 40%, close to that achieved in supervised mode. Index Terms: acoustic modeling,
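The super-vector construction and the two ways of deriving a new super-vector can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`build_supervector`, `interpolate`, `transform`), the toy model layout, and the per-dimension affine form of the transformation are assumptions made for illustration. The abstract does not specify the form of the transformation or how interpolation weights are estimated from adaptation utterances, so both are simply passed in here.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def build_supervector(hmm_means: Dict[str, Sequence[Sequence[float]]]) -> Vector:
    """Concatenate every Gaussian mean vector of every HMM into one vector.

    Models are visited in sorted name order so that super-vectors built for
    different environments line up dimension by dimension.
    """
    supervector: Vector = []
    for name in sorted(hmm_means):
        for mean in hmm_means[name]:
            supervector.extend(mean)
    return supervector


def interpolate(supervectors: Sequence[Vector], weights: Sequence[float]) -> Vector:
    """Weighted sum of environment super-vectors (weights assumed given)."""
    if len(supervectors) != len(weights):
        raise ValueError("need exactly one weight per super-vector")
    dim = len(supervectors[0])
    if any(len(sv) != dim for sv in supervectors):
        raise ValueError("all super-vectors must have the same length")
    return [
        sum(w * sv[i] for w, sv in zip(weights, supervectors))
        for i in range(dim)
    ]


def transform(anchor: Vector, scale: Sequence[float], bias: Sequence[float]) -> Vector:
    """Hypothetical per-dimension affine map applied to an anchor super-vector."""
    if not (len(anchor) == len(scale) == len(bias)):
        raise ValueError("anchor, scale and bias must have the same length")
    return [s * a + b for a, s, b in zip(anchor, scale, bias)]


# Toy example: two HMMs, three Gaussian means in total, two dimensions each.
clean_means = {"model_a": [[0.0, 1.0], [2.0, 3.0]], "model_b": [[4.0, 5.0]]}
noisy_means = {"model_a": [[2.0, 3.0], [4.0, 5.0]], "model_b": [[6.0, 7.0]]}

clean_sv = build_supervector(clean_means)
noisy_sv = build_supervector(noisy_means)

# New environment as a mixture of the two known environments.
mixed_sv = interpolate([clean_sv, noisy_sv], [0.25, 0.75])

# New environment as a transformation of the clean (anchor) super-vector.
shifted_sv = transform(clean_sv, [1.0] * len(clean_sv), [2.0] * len(clean_sv))
```

In this toy setup the noisy means are the clean means shifted by a constant, so the transformation with unit scale and a bias of 2.0 reproduces the noisy super-vector, while the interpolation lands between the two environments.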