Authors: T.W. Lewis, D.M.W. Powers
DOI: 10.1109/ISSPA.2005.1580196
Keywords:
Abstract: Auditory and visual signals provide complementary information, but few applications successfully combine the two sources. We consider a distinctive feature approach to Audio Visual Automatic Speech Recognition (AV-ASR) in which features appropriate to each modality are employed, and demonstrate that, in the absence of knowledge about the noise, using modality-specific features is best. However, even features from the non-preferred modality can be usefully employed if the environmental context (e.g. SNR) is accounted for by adaptively weighting each modality. Future research is focusing on deriving these features automatically from data rather than using those proposed by linguists.
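
The adaptive weighting mentioned in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' method: the logistic mapping from SNR to an audio weight, the `midpoint_db` and `slope_db` parameters, the function names, and the example scores are all hypothetical.

```python
import math


def audio_weight(snr_db, midpoint_db=10.0, slope_db=5.0):
    """Hypothetical logistic mapping from SNR (dB) to an audio weight in (0, 1).

    Assumption: the weight is 0.5 at midpoint_db and rises with cleaner audio.
    """
    return 1.0 / (1.0 + math.exp(-(snr_db - midpoint_db) / slope_db))


def fuse(audio_logp, visual_logp, snr_db):
    """Combine per-class log-scores from both modalities and pick the best class.

    The audio stream gets weight w and the visual stream gets 1 - w.
    """
    w = audio_weight(snr_db)
    fused = {c: w * audio_logp[c] + (1.0 - w) * visual_logp[c] for c in audio_logp}
    return max(fused, key=fused.get)


# Made-up scores for illustration: audio favours "b", video favours "d".
audio_logp = {"b": -0.5, "d": -1.5}
visual_logp = {"b": -2.0, "d": -0.3}

print(fuse(audio_logp, visual_logp, snr_db=30.0))   # clean audio: audio stream dominates
print(fuse(audio_logp, visual_logp, snr_db=-10.0))  # noisy audio: visual stream dominates
```

With these made-up scores, the decision follows the audio stream at high SNR and the visual stream at low SNR, which is the qualitative behaviour the abstract describes.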