Authors: Climent Nadeu, Pau Pachès-Leal, Biing-Hwang Juang
DOI: 10.1016/S0167-6393(97)00030-7
Keywords:
Abstract: In automatic speech recognition, the signal is usually represented by a set of time sequences of spectral parameters (TSSPs) that model the temporal evolution of the spectral envelope frame-to-frame. Those sequences are then filtered either to make them more robust to environmental conditions or to compute differential parameters (dynamic features) which enhance discrimination. In this paper, we apply frequency analysis to TSSPs in order to provide an interpretation framework for the various types of parameter filters used so far. Thus, the analysis of the average long-term spectrum of the successfully filtered TSSPs reveals a combined effect of equalization and band selection that provides insights into TSSP filtering. Also, we show in the paper that, when supplementary differential parameters are not used, the recognition rate can be improved even for clean speech, just by properly filtering the TSSPs. To support this claim, a number of experimental results are presented, both using whole-word and subword based models. The empirically optimum filters attenuate the low-pass band and emphasize a higher band so that the peak of the average long-term spectrum of the output of these filters lies at around the average syllable rate of the employed database (≈3 Hz).
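The filtering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 100 Hz frame rate, the synthetic trajectory, and the helper names (`regression_filter`, `filter_tssp`, `long_term_spectrum`, `demo_peak_shift`) are assumptions made for illustration. The filter is the common first-order regression (delta) filter, chosen only as a familiar example of a differential-parameter filter, not the empirically optimum filter reported in the paper.

```python
import numpy as np

FRAME_RATE_HZ = 100.0  # assumption: 10 ms frame shift, not stated in the abstract


def regression_filter(half_len=2):
    """First-order regression (delta) coefficients h(k) = k / sum(k^2), k = -K..K."""
    k = np.arange(-half_len, half_len + 1, dtype=float)
    return k / np.sum(k ** 2)


def filter_tssp(tssp, h):
    """Filter each parameter trajectory along time: y[t] = sum_k h(k) * x[t + k].

    tssp has shape (n_frames, n_params). Edges are handled by repeating the
    first and last frames, so the output keeps the same number of frames.
    """
    half_len = (len(h) - 1) // 2
    out = np.empty_like(tssp, dtype=float)
    for j in range(tssp.shape[1]):
        padded = np.pad(tssp[:, j].astype(float), half_len, mode="edge")
        # Convolving with the reversed kernel gives the correlation form above.
        out[:, j] = np.convolve(padded, h[::-1], mode="valid")
    return out


def long_term_spectrum(tssp, frame_rate=FRAME_RATE_HZ):
    """Power spectrum along the time axis, averaged over all parameters."""
    power = np.abs(np.fft.rfft(tssp, axis=0)) ** 2
    freqs = np.fft.rfftfreq(tssp.shape[0], d=1.0 / frame_rate)
    return freqs, power.mean(axis=1)


def demo_peak_shift():
    """Return the spectral peak frequency before and after filtering."""
    t = np.arange(1000) / FRAME_RATE_HZ
    # Synthetic trajectory (assumption): constant offset, slow 0.5 Hz drift,
    # and a 3 Hz component standing in for a syllable-rate modulation.
    x = 5.0 + 2.0 * np.sin(2 * np.pi * 0.5 * t) + np.sin(2 * np.pi * 3.0 * t)
    x = x[:, None]
    freqs, before = long_term_spectrum(x)
    _, after = long_term_spectrum(filter_tssp(x, regression_filter(2)))
    return freqs[np.argmax(before)], freqs[np.argmax(after)]
```

In this synthetic case, calling `demo_peak_shift()` should show the spectral peak moving from 0 Hz (the constant offset) to the 3 Hz component, since the filter removes the offset and attenuates the slow drift more strongly than the 3 Hz modulation.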