Accurate and quasi-automatic lip tracking

Authors: N. Eveno, A. Caplier, P.-Y. Coulon

DOI: 10.1109/TCSVT.2004.826754

Abstract: Lip segmentation is an essential stage in many multimedia systems such as videoconferencing, lip reading, or low-bit-rate coding communication systems. In this paper, we propose an accurate and robust quasi-automatic lip segmentation algorithm. First, the upper mouth boundary and several characteristic points are detected in the first frame by using a new kind of active contour: the "jumping snake." Unlike classic snakes, it can be initialized far from the final edge, and the adjustment of its parameters is easy and intuitive. Then, to achieve the segmentation, we propose a parametric model composed of several cubic curves. Its high flexibility enables accurate lip contour extraction even in the challenging case of a very asymmetric mouth. Compared to existing models, it brings a significant improvement in accuracy and realism. The segmentation in the following frames is achieved by using interframe tracking of the keypoints and the model parameters. However, we show that, with a usual tracking algorithm, the keypoints' positions become unreliable after a few frames. We therefore propose an adjustment process that enables accurate tracking even after hundreds of frames. Finally, we show that the mean keypoint tracking errors of our algorithm are comparable to manual points' selection errors.

References (20)
N. Eveno, A. Caplier, P.-Y. Coulon, "A parametric model for realistic lip segmentation," International Conference on Control, Automation, Robotics and Vision, vol. 3, pp. 1426-1431 (2002), DOI: 10.1109/ICARCV.2002.1234982
Tarcisio Coianiz, Lorenzo Torresani, Bruno Caprile, "2D Deformable Models for Visual Speech Analysis," Springer, Berlin, Heidelberg, pp. 391-398 (1996), DOI: 10.1007/978-3-662-13015-5_29
N. Eveno, A. Caplier, P.-Y. Coulon, "Jumping snakes and parametric model for lip segmentation," International Conference on Image Processing, vol. 2, pp. 867-870 (2003), DOI: 10.1109/ICIP.2003.1246818
Juergen Luettin, Neil A. Thacker, Steve W. Beet, "Active Shape Models for Visual Speech Feature Extraction," Speechreading by Humans and Machines, vol. 150, pp. 383-390 (1996), DOI: 10.1007/978-3-662-13015-5_28
N. Eveno, A. Caplier, P.-Y. Coulon, "New color transformation for lips segmentation," Multimedia Signal Processing, pp. 3-8 (2001), DOI: 10.1109/MMSP.2001.962702
Petar S. Aleksic, Jay J. Williams, Zhilin Wu, Aggelos K. Katsaggelos, "Audio-Visual Speech Recognition Using MPEG-4 Compliant Visual Features," EURASIP Journal on Advances in Signal Processing, vol. 2002, pp. 1213-1227 (2002), DOI: 10.1155/S1110865702206162
Ellen C. Hildreth, Shimon Ullman, "The Measurement of Visual Motion" (1984)
A. Hurlbert, T. Poggio, "Synthesizing a color algorithm from examples," Science, vol. 239, pp. 482-485 (1988), DOI: 10.1126/SCIENCE.3340834
D. Terzopoulos, K. Waters, "Analysis and synthesis of facial image sequences using physical and anatomical models," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, pp. 569-579 (1993), DOI: 10.1109/34.216726
P. Delmas, P.Y. Coulon, V. Fristot, "Automatic snakes for robust lip boundaries extraction," International Conference on Acoustics, Speech, and Signal Processing, vol. 6, pp. 3069-3072 (1999), DOI: 10.1109/ICASSP.1999.757489