Author: Alejandro Acero
DOI: 10.1007/978-1-4615-3122-7
Keywords:
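Abstract: This dissertation describes a number of algorithms developed to increase the robustness of automatic speech recognition systems with respect to changes in the environment. These algorithms attempt to improve the recognition accuracy of speech recognition systems when they are trained and tested in different acoustical environments, and when a desk-top microphone (rather than a close-talking microphone) is used for speech input. Without such processing, mismatches between training and testing conditions produce an unacceptable degradation in recognition accuracy. Two kinds of environmental variability are introduced by the use of desk-top microphones and different training and testing conditions: additive noise and spectral tilt introduced by linear filtering. An important attribute of the novel compensation algorithms described in this thesis is that they provide joint rather than independent compensation for these two types of degradation. Acoustical compensation is applied in our algorithms as an additive correction in the cepstral domain. This allows a higher degree of integration within SPHINX, the Carnegie Mellon speech recognition system, which uses the cepstrum as its feature vector. Therefore, these algorithms can be implemented very efficiently. Processing in many of these algorithms is based on the instantaneous signal-to-noise ratio (SNR), as the appropriate compensation represents a form of noise suppression at low SNRs and spectral equalization at high SNRs. The compensation vectors for additive noise and spectral transformations are estimated by minimizing the differences between speech feature vectors obtained from a "standard" training corpus of speech and feature vectors that represent the current acoustical environment. In our work this is accomplished by minimizing the distortion of vector-quantized cepstra that are produced by the feature extraction module in SPHINX. In this dissertation we describe several algorithms, including SNR-Dependent Cepstral Normalization (SDCN) and Codeword-Dependent Cepstral Normalization (CDCN). With CDCN, the accuracy of SPHINX when trained on speech recorded with a close-talking microphone and tested on speech recorded with a desk-top microphone is essentially the same as that obtained when the system is trained and tested on speech from the desk-top microphone. An algorithm for frequency normalization has also been proposed, in which the parameter of the bilinear transformation that is used by the signal-processing stage to produce frequency warping is adjusted for each new speaker and acoustical environment. The optimum value of this parameter is again chosen to minimize the vector-quantization distortion between the standard environment and the current one. In preliminary studies, use of this frequency normalization produced a moderate additional decrease in the observed error rate.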
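The idea of an additive cepstral correction selected by instantaneous SNR can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the function names (`snr_bin`, `compensate`), the uniform SNR binning, and the example correction vectors are all assumptions made for illustration. In the work described above the compensation vectors are estimated by minimizing vector-quantization distortion; here they are simply passed in as given.

```python
from typing import List, Sequence


def snr_bin(snr_db: float, lo_db: float, step_db: float, n_bins: int) -> int:
    """Map an instantaneous SNR (in dB) to a bin index, clamped to [0, n_bins - 1].

    Uniform binning is an assumption of this sketch, not a detail from the source.
    """
    idx = int((snr_db - lo_db) // step_db)
    return max(0, min(n_bins - 1, idx))


def compensate(
    frames: Sequence[Sequence[float]],
    snrs_db: Sequence[float],
    corrections: Sequence[Sequence[float]],
    lo_db: float = 0.0,
    step_db: float = 5.0,
) -> List[List[float]]:
    """Add an SNR-selected correction vector to each cepstral frame.

    frames      -- one cepstral vector per frame
    snrs_db     -- instantaneous SNR of each frame, in dB
    corrections -- hypothetical correction vectors, one per SNR bin
    """
    out = []
    for frame, snr in zip(frames, snrs_db):
        w = corrections[snr_bin(snr, lo_db, step_db, len(corrections))]
        # The compensation is purely additive in the cepstral domain.
        out.append([c + wi for c, wi in zip(frame, w)])
    return out


# Illustrative values only; real correction vectors would be estimated from data.
frames = [[1.0, 2.0], [3.0, 4.0]]
snrs_db = [2.0, 27.0]
corrections = [[0.5, -0.5], [0.0, 0.0], [0.25, 0.5]]
compensated = compensate(frames, snrs_db, corrections)
```

Because the correction is a table lookup followed by a vector addition per frame, its cost is small next to feature extraction, which is in line with the abstract's remark that these algorithms can be implemented very efficiently.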