Improving Speech Recognition through Automatic Selection of Age Group-Specific Acoustic Models

Authors: Annika Hämäläinen, Hugo Meinedo, Michael Tjalve, Thomas Pellegrini, Isabel Trancoso

DOI: 10.1007/978-3-319-09761-9_2

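Abstract: The acoustic models used by automatic speech recognisers are usually trained with speech collected from young to middle-aged adults. As the characteristics of speech change with age, such acoustic models tend to perform poorly on children's and elderly people's speech. In this study, we investigate whether the automatic age group classification of speakers, together with age group-specific acoustic models, could improve automatic speech recognition performance. We train an age group classifier with an accuracy of about 95% and show that using the results of the classifier to select age group-specific acoustic models for children and the elderly leads to considerable gains in automatic speech recognition performance, as compared with using acoustic models trained with young to middle-aged adults' speech for recognising their speech, as well.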

References (34)
Annika Hämäläinen, Silvia Rodrigues, Ana Júdice, Sandra Morgado Silva, António Calado, Fernando Miguel Pinto, Miguel Sales Dias. The CNG Corpus of European Portuguese Children’s Speech. Text, Speech and Dialogue, pp. 544-551 (2013). DOI: 10.1007/978-3-642-40585-3_68
Hugo Meinedo, João Paulo Neto, Ciro Martins, Luís B. Almeida. The Design of a Large Vocabulary Speech Corpus for Portuguese. Conference of the International Speech Communication Association (1997).
Thomas Pellegrini, Isabel Trancoso, Annika Hämäläinen, António Calado, Miguel Sales Dias, Daniela Braga. Impact of Age in ASR for the Elderly: Preliminary Experiments in European Portuguese. IberSPEECH, pp. 139-147 (2012). DOI: 10.1007/978-3-642-35292-8_15
Alessandro Vinciarelli, Elmar Nöth, Rob van Son, Björn W. Schuller, Stefan Steidl, Felix Burkhardt, Benjamin Weiss, Tobias Bocklet, Florian Eyben, Felix Weninger, Gelareh Mohammadi, Anton Batliner. The INTERSPEECH 2012 Speaker Trait Challenge. Conference of the International Speech Communication Association, pp. 254-257 (2012).
John C. Platt. Fast Training of Support Vector Machines Using Sequential Minimal Optimization. Advances in Kernel Methods, pp. 185-208 (1999).
Michael Wong, Daniel Elenius, Christian Hacker, Matteo Gerosa, Stefan Steidl, Martin J. Russell, Mats Blomberg, Diego Giuliani, Anton Batliner, Shona D'Arcy. The PF STAR Children's Speech Corpus. Conference of the International Speech Communication Association, pp. 2761-2764 (2005).
H. Van Hamme, C. Cucchiarini, F. Smits, O. van Herwijnen. JASMIN-CGN: Extension of the Spoken Dutch Corpus with Speech of Elderly People, Children and Non-natives in the Human-Machine Interaction Modality. Language Resources and Evaluation, pp. 135-138 (2006).
Ravichander Vipperla, Steve Renals, Joe Frankel. Longitudinal Study of ASR Performance on Ageing Voices. Conference of the International Speech Communication Association, pp. 2550-2553 (2008).
Shin-ya Takahashi, Tsuyoshi Morimoto, Sakashi Maeda, Naoyuki Tsuruta. Dialogue Experiment for Elderly People in Home Health Care System. Text, Speech and Dialogue, pp. 418-423 (2003). DOI: 10.1007/978-3-540-39398-6_60
B. Schölkopf, C. Burges, A. Smola. Advances in Kernel Methods: Support Vector Learning. International Conference on Neural Information Processing (1999). DOI: 10.5555/299094