Automatic dynamic template tracking of inner lips based on CLNF

Authors: Li Liu, Gang Feng, Denis Beautemps

DOI: 10.1109/ICASSP.2017.7953134

Keywords: Pixel, Computer vision, Cued speech, Computer science, Luminance, Aperture, Feature extraction, Visualization, Artificial intelligence, Discrete cosine transform

Abstract: In this paper, a novel automatic approach to extract the inner lips contour of speakers without using artifices is proposed. This method is based on a recent facial contour extraction model developed in computer vision, called Constrained Local Neural Field (CLNF), which provides 8 characteristic points (landmarks) defining the inner lips contour. However, when directly applied to our visual data, including Cued Speech (CS) data, CLNF failed in about 50% of cases. We propose a Modified CLNF to estimate the original CLNF landmarks. A dynamic template based on the first derivative of the smoothed luminance variation is explored in the new model. It gives a precise estimation of the aperture of the inner lips. It is evaluated on 4800 images of three French speakers. The proposed method corrects 95% of the CLNF errors, and a total RMSE of one pixel (i.e. 0.05 cm on average) is reached, instead of four pixels for the original CLNF.
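As an illustration of the idea of using the first derivative of a smoothed luminance variation to estimate the inner-lip aperture, here is a minimal Python sketch. It is not the authors' dynamic template: the function name `estimate_aperture`, the moving-average smoother, the window size, and the assumption that the mouth interior is darker than the lips along a vertical profile are all hypothetical choices made for this example.

```python
import numpy as np

def estimate_aperture(profile, window=5):
    """Estimate inner-lip edges on a vertical luminance profile (illustrative only).

    profile: 1-D luminance values sampled top to bottom through the mouth.
    window:  odd length of the moving-average smoother (assumed value).
    Returns (upper_index, lower_index, aperture_in_pixels).
    """
    profile = np.asarray(profile, dtype=float)
    kernel = np.ones(window) / window
    # Edge-pad so the smoothed signal keeps the original length.
    padded = np.pad(profile, window // 2, mode="edge")
    smoothed = np.convolve(padded, kernel, mode="valid")
    # First derivative of the smoothed luminance along the profile.
    derivative = np.gradient(smoothed)
    # Assumption: lips are brighter than the mouth interior, so the upper
    # inner edge is the steepest drop and the lower edge the steepest rise.
    upper = int(np.argmin(derivative))
    lower = int(np.argmax(derivative))
    return upper, lower, lower - upper

# Synthetic profile: bright lips (200) around a dark mouth interior (50).
synthetic = np.concatenate([np.full(20, 200.0), np.full(20, 50.0), np.full(20, 200.0)])
upper, lower, aperture = estimate_aperture(synthetic)
print(upper, lower, aperture)
```

On real images the profile would be noisy and the darker-interior assumption can fail (for example when teeth are visible), which is presumably part of why the abstract describes a dynamic template rather than a fixed rule.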

References (11)
Samir K. Bandyopadhyay, "Lip Contour Detection Techniques Based on Front View of Face," Journal of Global Research in Computer Sciences, vol. 2, pp. 43-46, 2011.
Panikos Heracleous, Denis Beautemps, Noureddine Aboutabit, "Cued Speech automatic recognition in normal-hearing and deaf subjects," Speech Communication, vol. 52, pp. 504-512, 2010. DOI: 10.1016/J.SPECOM.2010.03.001
D. Cristinacce, T. F. Cootes, "Feature Detection and Tracking with Constrained Local Models," British Machine Vision Conference, vol. 3, pp. 929-938, 2006. DOI: 10.5244/C.20.95
Gang Feng, "Data smoothing by cubic spline filters," IEEE Transactions on Signal Processing, vol. 46, pp. 2790-2796, 1998. DOI: 10.1109/78.720380
Jason M. Saragih, Simon Lucey, Jeffrey F. Cohn, "Deformable Model Fitting by Regularized Landmark Mean-Shift," International Journal of Computer Vision, vol. 91, pp. 200-215, 2011. DOI: 10.1007/S11263-010-0380-4
Sébastien Stillittano, Vincent Girondel, Alice Caplier, "Lip contour segmentation and tracking compliant with lip-reading application constraints," Machine Vision and Applications, vol. 24, pp. 1-18, 2013. DOI: 10.1007/S00138-012-0445-1
Tadas Baltrusaitis, Peter Robinson, Louis-Philippe Morency, "Constrained Local Neural Fields for Robust Facial Landmark Detection in the Wild," International Conference on Computer Vision, pp. 354-361, 2013. DOI: 10.1109/ICCVW.2013.54
Evangelos Skodras, Nikolaos Fakotakis, "An unconstrained method for lip detection in color images," International Conference on Acoustics, Speech, and Signal Processing, pp. 1013-1016, 2011. DOI: 10.1109/ICASSP.2011.5946578
Jia Pei, "Active Appearance Model," 2010.
Jian-Ming Zhang, Liang-Min Wang, De-Jiao Niu, Yong-Zhao Zhan, "Research and implementation of a real time approach to lip detection in video sequences," International Conference on Machine Learning and Cybernetics, vol. 5, pp. 2795-2799, 2003. DOI: 10.1109/ICMLC.2003.1260027