Authors: Juergen Luettin, Neil A. Thacker
Keywords: Image (mathematics), Speech recognition, Computer science, Speechreading, Visibility (geometry), Specularity, Hidden Markov model, Gaussian, Probabilistic logic, Tracking (particle physics)
Abstract: We describe a robust method for locating and tracking lips in gray-level image sequences. Our approach learns patterns of shape variability from a training set, which constrains the model during search to deform only in ways similar to the training examples. Image search is guided by a learned gray-level model that captures the large appearance variability of lips. Such variability might be due to different individuals, illumination, mouth opening, specularity, or the visibility of teeth and tongue. Visual speech features are recovered from the tracking results and represent both shape and intensity information. We describe a speechreading (lip-reading) system in which the extracted features are modeled by Gaussian distributions and their temporal dependencies by hidden Markov models. Experimental results are presented for locating lips, tracking lips, and speechreading. The database consists of a broad variety of speakers and was recorded in a natural environment with no special lighting or lip markers. For a speaker-independent digit recognition task using visual information only, the system achieved an accuracy about equivalent to that of untrained humans.
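To make the abstract's first modeling idea concrete, the sketch below shows one way a trained shape model can constrain deformation to modes of variability learned from example lip contours, in the spirit of the approach described above. This is a minimal illustration, not the authors' implementation; the class name, parameter values, and the PCA-style formulation are assumptions for illustration only.

```python
# Minimal sketch (illustrative only, not the paper's implementation) of a
# statistical shape model whose deformation is limited to modes learned from
# a training set of aligned lip contours.
import numpy as np

class ShapeModel:
    """PCA-style point-distribution model: a candidate contour is projected
    onto learned modes of variation and clipped to stay close to the
    training distribution."""

    def __init__(self, training_shapes, n_modes=8, limit=3.0):
        # training_shapes: (n_examples, 2 * n_points), aligned lip contours
        self.mean = training_shapes.mean(axis=0)
        centered = training_shapes - self.mean
        cov = np.cov(centered, rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(cov)          # ascending eigenvalues
        order = np.argsort(eigvals)[::-1][:n_modes]      # keep largest modes
        self.modes = eigvecs[:, order]                   # deformation directions
        self.stddev = np.sqrt(np.maximum(eigvals[order], 1e-12))
        self.limit = limit                               # e.g. +/- 3 std devs

    def constrain(self, shape):
        """Return the closest shape that deforms only in ways similar to
        the training examples."""
        b = self.modes.T @ (shape - self.mean)           # mode weights
        b = np.clip(b, -self.limit * self.stddev, self.limit * self.stddev)
        return self.mean + self.modes @ b
```

The recognition stage summarized in the abstract (features modeled by Gaussian distributions, temporal dependencies by hidden Markov models) could be prototyped in the same spirit with a Gaussian-emission HMM fitted per digit class, though the specific training and decoding choices used in the paper are not shown here.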