Speechreading using Probabilistic Models

DOI: 10.1006/CVIU.1996.0570

Keywords: Image (mathematics), Speech recognition, Computer science, Speechreading, Visibility (geometry), Specularity, Hidden Markov model, Gaussian, Probabilistic logic, Tracking (particle physics)

Abstract: We describe a robust method for locating and tracking lips in gray-level image sequences. Our approach learns patterns of shape variability from a training set, which constrains the model during image search to only deform in ways similar to the training examples. Image search is guided by a learned gray-level model which is used to describe the large variability of the appearance of lips. Such variability might be due to different individuals, illumination, mouth opening, specularity, or visibility of teeth and tongue. Visual speech features are recovered from the tracking results and represent both shape and intensity information. We describe a speechreading (lip-reading) system, where the extracted features are modeled by Gaussian distributions and their temporal dependencies by hidden Markov models. Experimental results are presented for locating lips, tracking lips, and speechreading. The database used consists of a broad variety of speakers and was recorded in a natural environment with no special lighting or lip markers used. For a speaker independent digit recognition task using visual information only, the system achieved an accuracy about equivalent to that of untrained humans.

References (61)

Eric David Petajan, Automatic lipreading to enhance speech recognition (speech reading). University of Illinois at Urbana-Champaign (1984).

Quentin Summerfield, Audio-visual Speech Perception, Lipreading and Artificial Stimulation. Hearing Science and Hearing Disorders, pp. 131-182 (1983). DOI: 10.1016/B978-0-12-460440-7.50010-7

Joseph S. Perkell, Physiology of Speech Production. Phonosurgery, pp. 5-21 (1989). DOI: 10.1007/978-4-431-68358-2_2

Alex Waibel, Paul Duchnowski, Uwe Meier, See me, hear me: integrating automatic speech recognition and lip-reading. Conference of the International Speech Communication Association (1994).

M. E. Lutman, M. P. Haggard, Hearing science and hearing disorders. Academic Press (1983).

C. Benoît, T. Guiard-Marigny, B. Le Goff, A. Adjoudani, Which components of the face do humans and machines best speechread? Springer Berlin Heidelberg, pp. 315-328 (1996). DOI: 10.1007/978-3-662-13015-5_24

Chung-Lin Huang, Ching-Wen Chen, Human facial feature extraction for face interpretation and recognition. International Conference on Pattern Recognition, vol. 25, pp. 1435-1444 (1992). DOI: 10.1016/0031-3203(92)90118-3

Steve W. Beet, Neil A. Thacker, Juergen Luettin, Statistical lip modelling for visual speech recognition. European Signal Processing Conference, pp. 1-4 (1996). DOI: 10.5281/ZENODO.36365

Kenji Mase, Alex Pentland, Automatic lipreading by optical-flow analysis. Systems and Computers in Japan, vol. 22, pp. 67-76 (1991). DOI: 10.1002/SCJ.4690220607

Tarcisio Coianiz, Lorenzo Torresani, Bruno Caprile, 2D Deformable Models for Visual Speech Analysis. Springer, Berlin, Heidelberg, pp. 391-398 (1996). DOI: 10.1007/978-3-662-13015-5_29
