A Comparison of Human and Machine Estimation of Speaker Age

Authors: Mark Huckvale, Aimee Webb

DOI: 10.1007/978-3-319-25789-1_11

Abstract: The estimation of the age of a speaker from his or her voice has both forensic and commercial applications. Previous studies have shown that human listeners are able to estimate the age of a speaker to within 10 years on average, while recent machine age estimation systems seem to show superior performance, with average errors as low as 6 years. However, the machine studies have used highly non-uniform test sets, for which knowledge of the age distribution offers a considerable advantage to the system. In this study we compare human and machine performance on the same test data, chosen to be uniformly distributed in age. We show that in this case human and machine accuracy is more similar, with average errors of 9.8 and 8.6 years respectively, although if panels of listeners are consulted, human accuracy can be improved to a value closer to 7.5 years. Both humans and machines have difficulty in accurately predicting the ages of older speakers.
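
The abstract's point about non-uniform test sets can be illustrated with a minimal sketch. It assumes that "average error" means mean absolute error in years, and it uses two invented lists of speaker ages (hypothetical, not data from the study). A trivial baseline that ignores the voice and always guesses the median age of the set scores a much lower error on the age-skewed set than on the uniform one, which suggests why knowing the age distribution can flatter a system.

```python
from statistics import median


def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute difference, in years, between true and predicted ages."""
    errors = [abs(t - p) for t, p in zip(true_ages, predicted_ages)]
    return sum(errors) / len(errors)


def constant_baseline_mae(true_ages):
    """Error of a 'system' that ignores the voice and always guesses the median age."""
    guess = median(true_ages)
    return mean_absolute_error(true_ages, [guess] * len(true_ages))


# Hypothetical test sets, invented for illustration only.
uniform_ages = list(range(20, 81, 5))  # evenly spread from 20 to 80
skewed_ages = [22, 23, 24, 25, 25, 26, 27, 28, 30, 32, 35, 50, 70]  # mostly young

uniform_mae = constant_baseline_mae(uniform_ages)
skewed_mae = constant_baseline_mae(skewed_ages)

print(f"constant-guess MAE, uniform set: {uniform_mae:.1f} years")
print(f"constant-guess MAE, skewed set:  {skewed_mae:.1f} years")
```

On the skewed set the constant guess already looks accurate without using any acoustic information, so a test set that is uniform in age gives a fairer comparison between listeners and machines.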

References (21)
Luís Torgo, Rita P. Ribeiro, Bernhard Pfahringer, Paula Branco. SMOTE for Regression. Portuguese Conference on Artificial Intelligence, pp. 378-389 (2013). DOI: 10.1007/978-3-642-40669-0_33
Miguel Sales Dias, Thomas Pellegrini, Annika Hämäläinen, Vahid Hedayati, Isabel Trancoso. Speaker age estimation for elderly speech recognition in European Portuguese. Conference of the International Speech Communication Association, pp. 2962-2966 (2014).
Alex J. Smola, Bernhard Schölkopf. A tutorial on support vector regression. Statistics and Computing, vol. 14, pp. 199-222 (2004). DOI: 10.1023/B:STCO.0000035301.49549.88
Evelyne Moyse, Aline Beaufort, Serge Brédart. Evidence for an own-age bias in age estimation from voices in older persons. European Journal of Ageing, vol. 11, pp. 241-247 (2014). DOI: 10.1007/S10433-014-0305-0
Mohamad Hasan Bahari, Hugo Van hamme. Speaker age estimation using Hidden Markov Model weight supervectors. Information Sciences Signal Processing and Their Applications, pp. 517-521 (2012). DOI: 10.1109/ISSPA.2012.6310606
Ming Li, Kyu J Han, Shrikanth Narayanan. Automatic speaker age and gender recognition using acoustic and prosodic level information fusion. Computer Speech & Language, vol. 27, pp. 151-167 (2013). DOI: 10.1016/J.CSL.2012.01.008
Mohamad Hasan Bahari, Mitchell McLaren, Hugo Van hamme, David A. van Leeuwen. Speaker age estimation using i-vectors. Engineering Applications of Artificial Intelligence, vol. 34, pp. 99-108 (2014). DOI: 10.1016/J.ENGAPPAI.2014.05.003
Robert M Krauss, Robin Freyberg, Ezequiel Morsella. Inferring speakers' physical attributes from their voices. Journal of Experimental Social Psychology, vol. 38, pp. 618-625 (2002). DOI: 10.1016/S0022-1031(02)00510-3
Florian Eyben, Felix Weninger, Florian Gross, Björn Schuller. Recent developments in openSMILE, the Munich open-source multimedia feature extractor. Proceedings of the 21st ACM International Conference on Multimedia (MM '13), pp. 835-838 (2013). DOI: 10.1145/2502081.2502224