Distinctive feature fusion for improved audio-visual phoneme recognition

Authors: T.W. Lewis, D.M.W. Powers

DOI: 10.1109/ISSPA.2005.1580196

Abstract: Auditory and visual signals provide complementary information, but few applications successfully combine the two sources. We consider a distinctive feature approach to Audio Visual Automatic Speech Recognition (AV-ASR) in which features appropriate to each modality are employed, and demonstrate that, in the absence of knowledge about the noise, using modality-specific features is best. However, even features from the non-preferred modality can be usefully employed if the environmental context (e.g. SNR) is accounted for by adaptively weighting the modality. Future research is focusing on deriving these features automatically from data rather than using those proposed by linguists.
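
The adaptive weighting mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the record gives no implementation details, so the logistic mapping from SNR to an audio weight, the parameter values (`midpoint_db`, `slope`), the log-linear fusion rule, and the function names `audio_weight` and `fuse` are all assumptions chosen for illustration. The probabilities in the usage note are invented toy values, not results from the paper.

```python
import math


def audio_weight(snr_db, midpoint_db=10.0, slope=0.3):
    """Map an SNR in dB to an audio weight in (0, 1).

    Assumed logistic mapping: clean audio (high SNR) gets a weight near 1,
    noisy audio (low SNR) a weight near 0. The parameters are hypothetical.
    """
    return 1.0 / (1.0 + math.exp(-slope * (snr_db - midpoint_db)))


def fuse(audio_probs, visual_probs, snr_db, floor=1e-12):
    """Combine per-class probabilities from two modalities.

    Uses a weighted sum of log-probabilities (log-linear fusion), with the
    audio weight w taken from the SNR and the visual weight set to 1 - w.
    Returns probabilities renormalised to sum to 1.
    """
    w = audio_weight(snr_db)
    scores = {}
    for label in audio_probs:
        # Floor the probabilities so log() never sees zero.
        a = max(audio_probs[label], floor)
        v = max(visual_probs[label], floor)
        scores[label] = w * math.log(a) + (1.0 - w) * math.log(v)
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores.values())
    unnormalised = {k: math.exp(s - top) for k, s in scores.items()}
    total = sum(unnormalised.values())
    return {k: u / total for k, u in unnormalised.items()}
```

With toy inputs such as `audio = {"b": 0.4, "d": 0.6}` and `visual = {"b": 0.9, "d": 0.1}`, calling `fuse(audio, visual, snr_db=30.0)` follows the audio stream, while `fuse(audio, visual, snr_db=-10.0)` follows the visual stream, which is the qualitative behaviour the abstract describes.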

References (14)
Partha Niyogi, Jialin Zhong, Eric Petajan. Feature based representation for audio-visual speech recognition. AVSP, pp. 16- (1999).
Quentin Summerfield. Some preliminaries to a comprehensive account of audio-visual speech perception. Lawrence Erlbaum Associates, Inc. (1987).
David M. W. Powers, Christopher C. R. Turk. Machine learning of natural language. Springer-Verlag New York, Inc. (1989). DOI: 10.1007/978-1-4471-1697-4
David M. W. Powers, Trent W. Lewis. Audio-Visual Speech Recognition using Red Exclusion and Neural Networks. Journal of Research and Practice in Information Technology, vol. 35, pp. 41-64 (2003).
Javier R. Movellan, Paul Mineiro. Robust Sensor Fusion: Analysis and Application to Audio Visual Speech Recognition. Machine Learning, vol. 32, pp. 85-100 (1998). DOI: 10.1023/A:1007468413059
Ahmed M.A. Ali, Jan Van der Spiegel, Paul Mueller. An acoustic-phonetic feature-based system for automatic phoneme recognition in continuous speech. International Symposium on Circuits and Systems, vol. 3, pp. 118-121 (1999). DOI: 10.1109/ISCAS.1999.778799
George A. Miller, Patricia E. Nicely. An Analysis of Perceptual Confusions Among Some English Consonants. The Journal of the Acoustical Society of America, vol. 27, pp. 338-352 (1955). DOI: 10.1121/1.1907526
Brian E. Walden, Robert A. Prosek, Allen A. Montgomery, Charlene K. Scherr, Carla J. Jones. Effects of Training on the Visual Recognition of Consonants. Journal of Speech and Hearing Research, vol. 20, pp. 130-145 (1977). DOI: 10.1044/JSHR.2001.130