An improved automatic lipreading system to enhance speech recognition

Authors: E. Petajan, B. Bischoff, D. Bodoff, N. M. Brooke

DOI: 10.1145/57167.57170

Abstract: Current acoustic speech recognition technology performs well with very small vocabularies in noise or with large vocabularies in very low noise. Accurate acoustic speech recognition in noise with vocabularies over 100 words has yet to be achieved. Humans frequently lipread the visible facial speech articulations to enhance speech recognition, especially when the acoustic signal is degraded by noise or hearing impairment. Automatic lipreading has been found to improve acoustic speech recognition significantly and could be advantageous in noisy environments such as offices, aircraft, and factories. An improved version of a previously described automatic lipreading system has been developed which uses vector quantization, dynamic time warping, and a new heuristic distance measure. This paper presents visual speech recognition results from multiple speakers under optimal conditions. Results from combined acoustic and visual speech recognition are also presented, which show significantly improved performance compared to the acoustic recognition system alone.
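
The abstract names vector quantization and dynamic time warping as building blocks without giving details. As a rough illustration only, the sketch below shows how template matching with those two techniques is commonly put together. It is not the authors' implementation: the function names, the toy codebook, and the templates are all hypothetical, and plain Euclidean distance between codebook entries stands in for the paper's heuristic distance measure, which this record does not describe.

```python
import math

def quantize(frames, codebook):
    """Vector quantization: map each feature vector to the index of its
    nearest codebook entry (Euclidean distance)."""
    return [
        min(range(len(codebook)), key=lambda k: math.dist(f, codebook[k]))
        for f in frames
    ]

def dtw_distance(seq_a, seq_b, local_cost):
    """Dynamic time warping with the basic step pattern
    (insertion, deletion, match) and no slope constraints."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = local_cost(seq_a[i - 1], seq_b[j - 1])
            d[i][j] = c + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

def classify(frames, templates, codebook):
    """Return the label of the template with the smallest DTW distance."""
    # Stand-in local cost (assumption): distance between codebook vectors.
    cost = lambda a, b: math.dist(codebook[a], codebook[b])
    codes = quantize(frames, codebook)
    return min(templates, key=lambda w: dtw_distance(codes, templates[w], cost))

# Toy data, invented for illustration: 2-D "mouth shape" features.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
templates = {"word_a": [0, 1, 2], "word_b": [2, 1, 0]}

# A slower utterance of "word_a": each mouth shape is held for two frames.
utterance = [(0.1, 0.0), (0.0, 0.1), (0.9, 0.1), (1.0, 0.0), (0.1, 0.8), (0.0, 1.0)]
label = classify(utterance, templates, codebook)
```

Because dynamic time warping allows one template frame to align with several input frames, the stretched utterance above still matches the `word_a` template at zero cost in this toy setup.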

References (5)
Eric David Petajan, Automatic lipreading to enhance speech recognition (speech reading), University of Illinois at Urbana-Champaign (1984)
A. Gersho, V. Cuperman, Vector quantization: A pattern-matching technique for speech coding, IEEE Communications Magazine, vol. 21, pp. 15-21 (1983), DOI: 10.1109/MCOM.1983.1091516
T. Huang, Coding of Two-Tone Images, IEEE Transactions on Communications, vol. 25, pp. 1406-1424 (1977), DOI: 10.1109/TCOM.1977.1093775
H. Sakoe, S. Chiba, Dynamic programming algorithm optimization for spoken word recognition, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 26, pp. 159-165 (1978), DOI: 10.1109/TASSP.1978.1163055