Impact of Age in ASR for the Elderly: Preliminary Experiments in European Portuguese

Authors: Thomas Pellegrini, Isabel Trancoso, Annika Hämäläinen, António Calado, Miguel Sales Dias

DOI: 10.1007/978-3-642-35292-8_15

Abstract: Standard automatic speech recognition (ASR) systems use acoustic models typically trained with speech of young adult speakers. Ageing is known to alter speech production in ways that require ASR systems to be adapted, in particular at the level of acoustic modeling. This paper reports ASR experiments that illustrate the impact of speaker age on speech recognition performance. A large read speech corpus in European Portuguese allowed us to measure statistically significant performance differences among groups of elderly speakers ranging from 60- to 90-year-old speakers. An increase of 41% relative (11.9% absolute) in word error rate was observed between 60-65-year-old and 81-86-year-old speakers. This paper also reports experiments on retraining acoustic models (AMs), further illustrating the impact of ageing on ASR performance. Differentiated gains were observed depending on the age range of the adaptation data used to retrain the acoustic models.
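
The relation between the two figures quoted above (41% relative, 11.9% absolute) can be made explicit with a small calculation. The sketch below assumes the relative increase is measured against the word error rate (WER) of the 60-65-year-old group. The helper name `implied_wers` is hypothetical, and the resulting WER values are back-computed from the rounded figures in the abstract, so they are approximations and not numbers reported in this record.

```python
def implied_wers(absolute_increase, relative_increase):
    """Return (baseline, degraded) WER in percent implied by the two figures.

    Assumes: relative_increase = absolute_increase / baseline.
    """
    baseline = absolute_increase / relative_increase
    degraded = baseline + absolute_increase
    return baseline, degraded


# Figures quoted in the abstract: 11.9 points absolute, 41% relative.
baseline, degraded = implied_wers(11.9, 0.41)
print(f"implied WER, 60-65-year-old group: about {baseline:.1f}%")
print(f"implied WER, 81-86-year-old group: about {degraded:.1f}%")
```

Under that assumption, the younger group's WER would sit near 29% and the older group's near 41%, which shows why the absolute and relative figures differ so much.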
