Authors: M. Kubanek, J. Bobulski, L. Adrjanowicz
DOI: 10.2478/V10175-012-0041-6
Keywords:
Abstract: This paper focuses on combining audio-visual signals for Polish speech recognition in conditions of a highly disturbed audio signal. Recognition was based on combined hidden Markov models (CHMM). The described methods were developed for a single isolated command; nevertheless, their effectiveness indicated that they would work similarly in continuous audio-visual recognition. The problem of visual analysis is very difficult and computationally demanding, mostly because an extreme amount of data needs to be processed. Therefore, the audio-video method is used only while the audio speech signal is exposed to a considerable level of distortion. The paper proposes the authors' own methods for lip edge detection and characteristic extraction. Moreover, a method of fusing the audio and visual characteristics was tested. A significant increase in processing speed was noted during tests, given properly selected CHMM parameters and an adequate codebook size, along with the use of an appropriate fusion of characteristics. The experimental results are promising and close to those achieved by leading scientists in the field.