Visual Speech in Technology-Enhanced Learning

Author: Priya Dey

Abstract: This thesis investigates the use of synthetic talking heads, with lip, tongue and face movements synchronized with synthesized or natural speech, in technology-enhanced learning. The work applies talking heads in a speech tutoring application for teaching English as a second language. Previous studies have shown that speech perception is aided by visual information, but more research is needed to determine the effectiveness of the visualization of articulators in pronunciation training. The thesis explores whether or not the technology can give an improvement in learning pronunciation. Techniques of audiovisual speech synthesis, using both viseme-based and data-driven approaches, are used to implement multiple talking heads. Intelligibility studies found the talking heads to be more intelligible than audio alone, and the more intelligible head was the viseme-driven implementation. The talking heads are applied in a pronunciation-training application, which is evaluated by second-language learners to investigate its benefit. User trials explored the efficacy of the software in demonstrating the /b/–/p/ contrast in English. The results indicate that learners showed an improvement in listening after using the software, while the improvement compared to auditory training alone varied between individuals. In user evaluations, the talking heads were perceived to be helpful in learning pronunciation, and the positive feedback on the system suggests that talking heads could be a useful addition to traditional methods.
