Audio-visual integration in multimodal communication

Authors: Tsuhan Chen, R. R. Rao

DOI: 10.1109/5.664274

Abstract: We review recent research that examines audio-visual integration in multimodal communication. The topics include the bimodality of human speech, automated lip reading, facial animation, synchronization, joint audio-video coding, and bimodal speaker verification. We also study the enabling technologies for these topics, including automatic facial-feature tracking and audio-to-visual mapping. Recent progress shows that the joint processing of audio and video provides advantages that are not available when the two are processed independently.

References (61)
G. Wolberg, Digital Image Warping, IEEE Computer Society Press (1990)
Kerry P. Green, "The Use of Auditory and Visual Information in Phonetic Perception," Springer, Berlin, Heidelberg, pp. 55–77 (1996), doi:10.1007/978-3-662-13015-5_5
Claude C. Chibelushi, John S. Mason, R. Deravi, "Integration of acoustic and visual speech for speaker recognition," Conference of the International Speech Communication Association (1993)
Eric David Petajan, "Automatic lipreading to enhance speech recognition (speech reading)," University of Illinois at Urbana-Champaign (1984)
Eric Cosatto, Gerasimos Potamianos, Hans Peter Graf, David B. Roe, "Speaker independent audio-visual database for bimodal ASR," Proc. AVSP'97, pp. 65–68 (1997)
Quentin Summerfield, "Some preliminaries to a comprehensive account of audio-visual speech perception," Lawrence Erlbaum Associates, Inc. (1987)
Peter L. Silsbee, Alan C. Bovik, "Medium Vocabulary Audiovisual Speech Recognition," Springer Berlin Heidelberg, pp. 120–123 (1995), doi:10.1007/978-3-642-57745-1_21
Dominic W. Massaro, "Bimodal Speech Perception: A Progress Report," Springer Berlin Heidelberg, pp. 79–101 (1996), doi:10.1007/978-3-662-13015-5_6