Applications in Intelligent Speech Analysis

Author: Björn Schuller

DOI: 10.1007/978-3-642-36806-6_10

Abstract: Speech is broadly considered the most natural form of communication for humans. Manifold applications open up for general technical and computer systems once they are able to recognise speech as well as humans do, whether for interaction with humans, mediation between humans, or retrieval. Here, state-of-the-art methodology is presented for highly robust speech recognition, recognition of non-linguistic vocalisations, and recognition of paralinguistic speaker states and traits, exemplified by sentiment, emotion, interest, age, gender, intoxication and sleepiness. All examples stem from the author's recent work. In particular, the latter were chosen from a series of Challenges co-organised by the author at Interspeech from 2009 onwards.

References (202)
Nataliya Romanyshyn, Paralinguistic maintenance of verbal communicative interaction in literary discourse (on the material of W. S. Maugham's novel “Theatre”). International Conference on Experience of Designing and Applications of CAD Systems in Microelectronics, pp. 550-552 (2009)
Martin Wöllmer, Björn Schuller, Anton Batliner, Stefan Steidl, Dino Seppi, Tandem decoding of children's speech for keyword detection in a child-robot interaction scenario. ACM Transactions on Speech and Language Processing, vol. 7, pp. 12- (2011), 10.1145/1998384.1998386
A. Maier, T. Haderlein, U. Eysholdt, F. Rosanowski, A. Batliner, M. Schuster, E. Nöth, PEAKS - A system for the automatic evaluation of voice and speech disorders. Speech Communication, vol. 51, pp. 425-437 (2009), 10.1016/J.SPECOM.2009.01.004
R. Cowie, E. Douglas-Cowie, N. Tsapatsoulis, G. Votsis, S. Kollias, W. Fellenz, J.G. Taylor, Emotion recognition in human-computer interaction. IEEE Signal Processing Magazine, vol. 18, pp. 32-80 (2001), 10.1109/79.911197
Björn Schuller, Ronald Müller, Florian Eyben, Jürgen Gast, Benedikt Hörnler, Martin Wöllmer, Gerhard Rigoll, Anja Höthker, Hitoshi Konosu, Being bored? Recognising natural interest by extensive audiovisual integration for real-life application. Image and Vision Computing, vol. 27, pp. 1760-1774 (2009), 10.1016/J.IMAVIS.2009.02.013
Alex Stupakov, Evan Hanusa, Jeff Bilmes, Dieter Fox, COSINE - A corpus of multi-party COnversational Speech In Noisy Environments. International Conference on Acoustics, Speech, and Signal Processing, pp. 4153-4156 (2009), 10.1109/ICASSP.2009.4960543
M. Schroder, E. Bevacqua, R. Cowie, F. Eyben, H. Gunes, D. Heylen, M. ter Maat, G. McKeown, S. Pammi, M. Pantic, C. Pelachaud, B. Schuller, E. de Sevin, M. Valstar, M. Wollmer, Building Autonomous Sensitive Artificial Listeners. IEEE Transactions on Affective Computing, vol. 3, pp. 165-183 (2012), 10.1109/T-AFFC.2011.34
T. Athanaselis, S. Bakamidis, I. Dologlou, R. Cowie, E. Douglas-Cowie, C. Cox, ASR for emotional speech: Clarifying the issues and enhancing performance. Neural Networks, vol. 18, pp. 437-444 (2005), 10.1016/J.NEUNET.2005.03.008