Lip feature extraction and reduction for HMM-based visual speech recognition systems

Authors: S. Alizadeh, R. Boostani, V. Asadpour

DOI: 10.1109/ICOSP.2008.4697195

Abstract: Lipreading is a main part of audio-visual speech recognition systems, which mostly face redundancy in the extracted features. In this paper, a new approach is proposed to increase lipreading performance by extracting features in a discriminant way. First, faces are detected; then, lip key points are located, and four cubic curves characterize the lip contours. Next, visual features are extracted from the contours for each frame. To discriminate each unit (word) from the others, the features of that unit's frames are arranged in a feature vector. Moreover, the differences between the features of frame k and those of the previous frame are used to construct more informative feature vectors. To solve the small sample size problem, direct linear discriminant analysis (D-LDA) is employed to reduce the feature size. To classify these transformed features, a hidden Markov model (HMM) is employed to recognize the units. The algorithm was applied to the M2VTS database. Results show that applying D-LDA for feature reduction provides better classification accuracy than employing the HMM without feature reduction.
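The feature-construction and reduction steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `add_delta_features` and `fit_direct_lda` are hypothetical, the D-LDA step follows the general procedure of the direct LDA algorithm (diagonalize the between-class scatter first, then the within-class scatter in that subspace), the input is assumed to be one fixed-length feature vector per unit, and the HMM classification stage is omitted.

```python
import numpy as np


def add_delta_features(frames):
    """Append frame-to-frame differences to per-frame features.

    frames: (T, d) array of per-frame features (assumed layout).
    Returns a (T, 2d) array; the first frame's difference is zero.
    """
    frames = np.asarray(frames, dtype=float)
    deltas = np.diff(frames, axis=0, prepend=frames[:1])
    return np.hstack([frames, deltas])


def fit_direct_lda(X, y, n_components, tol=1e-10):
    """Sketch of direct LDA. Returns a (d, k) projection matrix W.

    X: (n, d) feature vectors, y: (n,) class labels.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    mean_all = X.mean(axis=0)

    # Columns are sqrt(n_c) * (class mean - global mean),
    # so the between-class scatter is S_b = phi_b @ phi_b.T.
    phi_b = np.stack(
        [np.sqrt((y == c).sum()) * (X[y == c].mean(axis=0) - mean_all)
         for c in classes],
        axis=1,
    )

    # Left singular vectors of phi_b are eigenvectors of S_b; drop its null space.
    U, s, _ = np.linalg.svd(phi_b, full_matrices=False)
    keep = s > tol * s.max()
    Z = U[:, keep] / s[keep]  # whitening: Z.T @ S_b @ Z = I

    # Within-class scatter projected into the retained subspace.
    Xw = np.vstack([X[y == c] - X[y == c].mean(axis=0) for c in classes])
    G = Xw @ Z
    _, evecs = np.linalg.eigh(G.T @ G)  # eigenvalues in ascending order

    # Keep the directions with the smallest within-class scatter.
    k = min(n_components, Z.shape[1])
    return Z @ evecs[:, :k]
```

Projecting with `X @ W` yields at most (number of classes - 1) dimensions. Because the between-class scatter is diagonalized first, the within-class scatter never has to be inverted, which is what makes the approach usable when there are fewer samples than feature dimensions.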

References (12)

J. Andrew Bangham, Richard Harvey, Iain Matthews, Stephen Cox. Nonlinear scale decomposition based features for visual speech recognition. European Signal Processing Conference, pp. 1-4 (1998). DOI: 10.5281/ZENODO.36896

Lionel Revéret, Christian Benoît. A Viseme-based Approach to Labiometrics for Automatic Lipreading. AVBPA '97: Proceedings of the First International Conference on Audio- and Video-Based Biometric Person Authentication, pp. 335-342 (1997). DOI: 10.1007/BFB0016013

Stéphane Pigeon, Luc Vandendorpe. The M2VTS Multimodal Face Database (Release 1.00). Lecture Notes in Computer Science, vol. 1206, pp. 403-409 (1997). DOI: 10.1007/BFB0016021

N. Eveno, A. Caplier, P.-Y. Coulon. New color transformation for lips segmentation. Multimedia Signal Processing, pp. 3-8 (2001). DOI: 10.1109/MMSP.2001.962702

Hua Yu, Jie Yang. A direct LDA algorithm for high-dimensional data — with application to face recognition. Pattern Recognition, vol. 34, pp. 2067-2070 (2001). DOI: 10.1016/S0031-3203(00)00162-X

Juergen Luettin, Neil A. Thacker. Speechreading using Probabilistic Models. Computer Vision and Image Understanding, vol. 65, pp. 163-178 (1997). DOI: 10.1006/CVIU.1996.0570

N. Eveno, A. Caplier, P.-Y. Coulon. Accurate and quasi-automatic lip tracking. IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, pp. 706-715 (2004). DOI: 10.1109/TCSVT.2004.826754

L.R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, vol. 77, pp. 267-296 (1989). DOI: 10.1109/5.18626

P.L. Silsbee, A.C. Bovik. Computer lipreading for improved accuracy in automatic speech recognition. IEEE Transactions on Speech and Audio Processing, vol. 4, pp. 337-351 (1996). DOI: 10.1109/89.536928

Keinosuke Fukunaga. Introduction to Statistical Pattern Recognition (1972).
