A Viseme-based Approach to Labiometrics for Automatic Lipreading

Authors: Lionel Revéret, Christian Benoît

DOI: 10.1007/BFB0016013

Abstract: There are two main approaches to the preprocessing of mouth images in automatic lipreading. The stochastic approach makes wide use of learning techniques that yield poorly interpretable image features. The articulatory approach aims at measuring, as accurately as possible, anatomical and/or geometrical parameters which can be interpreted in phonetic terms. We call this approach “labiometrics”. Although the method proposed here involves image processing techniques generally used in the stochastic approach, it is articulatory-oriented: not only does it give some description of a mouth shape in phonetic terms (i.e., in terms of visemes), but above all it gives a reliable evaluation of labiometric parameters that could not be automatically measured on natural lips without prior make-up. Moreover, the stochastic component of our approach is based on a limited set of training images, so its computation cost remains low.
