Authors: Eduardo Lleida, Antonio Miguel, Luis Buera, Alfonso Ortega, Richard C. Rose
DOI:
Keywords:
Abstract: This paper presents a decoding method for automatic speech recognition (ASR) that reduces the impact of local spectral and temporal variabilities on ASR performance. The procedure involves augmenting the standard Viterbi search for an optimum state sequence with a locally constrained search for optimum degrees of spectral warping or temporal warping applied to individual analysis frames. It is argued in the paper that this represents an efficient and effective method for compensating for local variability in speech, which may have potential application to a broader array of speech transformations. The techniques are presented in the context of existing methods for frequency warping based speaker normalization and for the computation of dynamic features in ASR. The modified decoding algorithms were evaluated in both clean and noisy task domains using subsets of the Aurora 2 and Aurora 3 Speech Corpora under clean and noisy conditions. It was found that, under conditions from the Spanish Language Subset of the Speech-DatCar database, the local warping transformations reduced word error rate (WER) by 24 percent. This is a factor of two greater reduction in WER than that obtained for the same task with the more well known vocal tract length normalization (VTLN) procedure.
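
The abstract gives no algorithmic detail, so the following Python sketch is only one possible reading of "augmenting the Viterbi search with a locally constrained search over warping". It assumes the search runs over pairs (state, warp index) and that the warp index may change by at most `max_warp_step` between consecutive frames. The function name `augmented_viterbi`, its parameters, and the array layout are hypothetical and are not taken from the paper.

```python
import numpy as np

def augmented_viterbi(log_emit, log_trans, log_init, max_warp_step=1):
    """Viterbi search over an augmented (state, warp) space.

    log_emit  : (T, S, W) log-likelihood of frame t in state s under warp w
    log_trans : (S, S) log state-transition scores (previous -> next)
    log_init  : (S,) log initial-state scores
    Assumption (not from the paper): the warp index may move by at most
    max_warp_step between adjacent frames, at no extra cost.
    """
    T, S, W = log_emit.shape
    idx = np.arange(W)
    # Local constraint on the warp trajectory: 0 if allowed, -inf otherwise.
    allowed = np.abs(idx[:, None] - idx[None, :]) <= max_warp_step
    log_warp = np.where(allowed, 0.0, -np.inf)
    # Joint transition scores, shape (S_prev, W_prev, S_next, W_next).
    trans = log_trans[:, None, :, None] + log_warp[None, :, None, :]

    delta = log_init[:, None] + log_emit[0]      # best score per (s, w)
    back = np.zeros((T, S, W, 2), dtype=int)     # backpointers (s_prev, w_prev)
    for t in range(1, T):
        cand = delta[:, :, None, None] + trans   # (S_prev, W_prev, S, W)
        flat = cand.reshape(S * W, S, W)         # flat index = s_prev * W + w_prev
        best = flat.argmax(axis=0)
        back[t, :, :, 0], back[t, :, :, 1] = np.divmod(best, W)
        delta = flat.max(axis=0) + log_emit[t]

    # Trace back the best joint path of (state, warp) pairs.
    s, w = np.unravel_index(delta.argmax(), delta.shape)
    score = float(delta[s, w])
    path = [(int(s), int(w))]
    for t in range(T - 1, 0, -1):
        s, w = back[t, s, w]
        path.append((int(s), int(w)))
    path.reverse()
    return path, score

# Toy usage: one state, three warp factors, two frames.
emit = np.array([[[0.0, -5.0, -10.0]],
                 [[-10.0, -4.0, 0.0]]])
path, score = augmented_viterbi(emit, np.array([[0.0]]), np.array([0.0]))
print(path, score)
```

In this toy input the best per-frame warps are indices 0 and 2, but the step constraint forbids that jump, so the search settles on a smoother warp trajectory. Under these assumptions the cost per frame grows with the square of the number of (state, warp) pairs, so a practical decoder would likely restrict or prune the warp grid.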