Toward the Automatic Generation of Cued Speech

Authors: Maroula S. Bratakos, Louis D. Braida, Paul Duchnowski

Keywords: Speech recognition, Process (computing), Cue-dependent forgetting, Psychology, Speechreading, Speech reception, Cued speech

Abstract: Although Manual Cued Speech (MCS) can greatly facilitate both education and communication for the deaf, its use is limited to situations in which the talker, or a transliterator, is able to produce cues for the cue receiver. The availability of automatically produced cues would substantially relax this restriction. However, it is unclear whether current automatic speech recognition (ASR) technology would be adequate for producing cues automatically. To evaluate its adequacy, we measured the speech reception scores achieved by highly experienced receivers of MCS when cues were produced by one actual and several hypothesized ASR systems, as well as with speechreading alone and with MCS. The systems studied modelled the effects of various types of errors and delays associated with the ASR process, and included a representative state-of-the-art, speaker-dependent phonetic recognizer. Results indicate that while speaker-independent recognizers would probably not provide useful cues, cues provided by speaker-dependent recognizers would aid speechreading substantially. The benefit provided by automatically generated cues is heavily dependent on an effective visual display that minimizes delays so that the cues are perceived in synchrony with the facial actions.
