The INTERSPEECH 2010 Paralinguistic Challenge

Authors: Laurence Devillers, Björn W. Schuller, Stefan Steidl, Felix Burkhardt, Shrikanth S. Narayanan

Keywords: Computer science, Artificial intelligence, Paralanguage, Speech recognition, Natural language processing

Abstract: Most paralinguistic analysis tasks are lacking agreed-upon evaluation procedures and comparability, in contrast to more ‘traditional’ disciplines in speech analysis. The INTERSPEECH 2010 Paralinguistic Challenge shall help overcome the usually low compatibility of results, by addressing three selected sub-challenges. In the Age Sub-Challenge, the age of speakers has to be determined in four groups. In the Gender Sub-Challenge, a three-class classification task has to be solved and, finally, the Affect Sub-Challenge asks for speakers’ interest in ordinal representation. This paper introduces the conditions, the Challenge corpora “aGender” and “TUM AVIC”, and standard feature sets that may be used. Further, baseline results are given.

Index Terms: Paralinguistic Challenge, Age, Gender, Affect

1. Introduction

Most paralinguistic analysis tasks resemble each other not only by means of processing and ever-present data sparseness, but by lacking agreed-upon evaluation procedures and comparability, in contrast to more ‘traditional’ disciplines in speech analysis. At the same time, this is a rapidly emerging field of research, due to the constantly growing interest in applications in the fields of Human-Machine Communication, Human-Robot Communication, and Multimedia Retrieval. In these respects, the INTERSPEECH 2010 Paralinguistic Challenge shall help bridge the gap between excellent research on paralinguistic information in spoken language and the low compatibility of results, by addressing three selected tasks. The “aGender” and “TUM AVIC” corpora are provided by the organizers. The first consists of 46 hours of telephone speech, stemming from 954 speakers, and serves to evaluate features and algorithms for the detection of speaker age and gender. The second features 2 hours of human conversational speech recording (21 subjects), annotated in 5 different levels of interest. The corpus further features a uniquely detailed transcription of spoken content with word boundaries by forced alignment, non-linguistic vocalizations, single annotator tracks, and the sequence of (sub-)speaker-turns. Both are given ...
