Authors: Florian Eyben, Martin Wöllmer, Alex Graves, Björn Schuller, Ellen Douglas-Cowie
DOI: 10.1007/S12193-009-0032-6
Keywords: Computer science, Recurrent neural nets, Long short term memory, Time delay neural network, Linguistics, Emotion recognition, Speech recognition, Intelligent character recognition, Recurrent neural network, Speaker recognition
Abstract: For many applications of emotion recognition, such as virtual agents, the system must select responses while the user is speaking. This requires reliable on-line recognition of the user’s affect. However, most emotion recognition systems are based on turnwise processing. We present a novel approach to on-line emotion recognition from speech using Long Short-Term Memory Recurrent Neural Networks. Emotion is recognised frame-wise in a two-dimensional valence-activation continuum. In contrast to current state-of-the-art approaches, recognition is performed on low-level signal frames, similar to those used for speech recognition. No statistical functionals are applied to low-level feature contours. Framing at a higher level is therefore unnecessary, and regression outputs can be produced in real-time for every low-level input frame. We also investigate the benefits of including linguistic features on the signal frame level obtained by a keyword spotter.