An approach to statistical lip modelling for speaker identification via chromatic feature extraction

Authors: T. Wark, S. Sridharan, V. Chandran

DOI: 10.1109/ICPR.1998.711095

Keywords:

Abstract: This paper presents a novel technique for tracking moving lips for the purpose of speaker identification. In our system, a model of the lip contour is formed directly from chromatic information in the lip region, so iterative refinement of point estimates is not required. Colour features are extracted via concatenated profiles taken around the contour. Order reduction is obtained by principal component analysis (PCA) followed by linear discriminant analysis (LDA). Statistical speaker models are built based on Gaussian mixture models (GMM). Identification experiments performed on the M2VTS database show encouraging results.
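The classification pipeline the abstract describes (dimensionality reduction by PCA, discrimination by LDA, per-speaker statistical models) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the synthetic feature vectors stand in for the real concatenated chromatic profiles, and a single Gaussian per speaker is used as a one-component stand-in for the paper's Gaussian mixture models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data standing in for the paper's concatenated
# chromatic profiles sampled around the lip contour.
n_speakers, n_per, dim = 3, 40, 20
centres = rng.normal(0.0, 2.0, (n_speakers, dim))
X = np.vstack([c + rng.normal(0.0, 1.0, (n_per, dim)) for c in centres])
y = np.repeat(np.arange(n_speakers), n_per)

def pca(X, k):
    """Project onto the top-k principal components (order reduction)."""
    Xc = X - X.mean(0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def lda(X, y, k):
    """Fisher LDA: maximise between-class over within-class scatter."""
    mu = X.mean(0)
    Sw = np.zeros((X.shape[1], X.shape[1]))
    Sb = np.zeros_like(Sw)
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mu, mc - mu)
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(evals.real)[::-1]
    return X @ evecs[:, order[:k]].real

Xp = pca(X, 10)                       # PCA: 20 -> 10 dimensions
Xl = lda(Xp, y, n_speakers - 1)       # LDA: 10 -> C-1 dimensions

# Per-speaker diagonal Gaussian (one-component simplification of a GMM).
models = [(Xl[y == c].mean(0), Xl[y == c].var(0) + 1e-6)
          for c in range(n_speakers)]

def identify(x):
    """Return the speaker whose model gives the highest log-likelihood."""
    ll = [-0.5 * np.sum(np.log(2 * np.pi * v) + (x - m) ** 2 / v)
          for m, v in models]
    return int(np.argmax(ll))

preds = np.array([identify(x) for x in Xl])
accuracy = float((preds == y).mean())
```

On well-separated synthetic classes like these, the LDA-projected features are almost perfectly identifiable; the point of the sketch is only the ordering of stages (PCA, then LDA, then per-speaker likelihood scoring).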

References (7)

1. Shinobu Takamatsu, Akio Ogihara, Satoru Igawa, Akira Shintani, "Speech Recognition Based on Fusion of Visual and Auditory Information Using Full-Frame Color Image," IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. 79, pp. 1836-1840, 1996.
2. Tarcisio Coianiz, Lorenzo Torresani, Bruno Caprile, "2D Deformable Models for Visual Speech Analysis," Springer, Berlin, Heidelberg, pp. 391-398, 1996. DOI: 10.1007/978-3-662-13015-5_29
3. M. U. Ramos Sánchez, J. Matas, J. Kittler, "Statistical Chromaticity Models for Lip Tracking with B-splines," AVBPA '97: Proceedings of the First International Conference on Audio- and Video-Based Biometric Person Authentication, pp. 69-76, 1997. DOI: 10.1007/BFB0015981
4. J. Luettin, N.A. Thacker, S.W. Beet, "Locating and tracking facial speech features," International Conference on Pattern Recognition, vol. 1, pp. 652-656, 1996. DOI: 10.1109/ICPR.1996.546105
5. Douglas A. Reynolds, "Speaker identification and verification using Gaussian mixture speaker models," Speech Communication, vol. 17, pp. 91-108, 1995. DOI: 10.1016/0167-6393(95)00009-D
6. T. Wark, S. Sridharan, "A syntactic approach to automatic lip feature extraction for speaker identification," International Conference on Acoustics, Speech and Signal Processing, vol. 6, pp. 3693-3696, 1998. DOI: 10.1109/ICASSP.1998.679685