Authors: Thomas Di Giacomo, Stephane Garchery, Nadia Magnenat-Thalmann
DOI: 10.1007/978-1-84628-907-1_2
Keywords:
Abstract: With the emergence of 3D graphics, we are now able to create very realistic characters that can move and talk. Multimodal interaction with such characters is also possible, as various technologies have matured for speech and video analysis, natural language dialogues, and animation. However, the behavior expressed by these characters is far from believable in most systems. We feel this problem arises from their lack of individuality on several levels: perception, dialogue, and expression. In this chapter, we describe the results of research that tries to realistically connect personality to characters, not only on an expressive level (for example, generating individualized expressions on a face), but also through real-time tracking, on a dialogue level (generating responses that actually correspond to what a character in a certain emotional state would say), and on a perceptive level (having the virtual character use expression and user data to produce corresponding behavior). The idea of linking emotion to an agent has been discussed by Marsella et al. [33], concerning the influence of emotion in general, and by Johns [21], concerning how emotions affect decision making. Traditionally, any text- or voice-driven animation system uses phonemes as the basic units of speech and visemes as the corresponding visual units. Though text-to-speech synthesizers and phoneme recognizers often use biphone-based techniques, the end user seldom has access to this information, except in dedicated systems. Most commercially and freely available software applications allow the extraction of time-stamped phoneme streams along with the audio. Thus, in order to generate speech animation, extra processing, namely co-articulation, is required. This process takes care of the influence of neighboring phonemes for fluent speech production. This processing stage can be eliminated by using the syllable as the basic unit rather than the phoneme. Overall, we do not intend to give a complete survey of ongoing research on behavior, emotion, and personality. Our main goal is conversational agents that interact through many modalities. We thus concentrate on the extraction of emotion from real data (Section 2.3), visyllable-based animation (Section 2.4), and dialogue systems with emotions (Section 2.5).
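To make the phoneme/viseme pipeline mentioned in the abstract concrete, the following is a minimal sketch, not the authors' system: it maps a time-stamped phoneme stream to viseme targets and applies a simple co-articulation blend in which part of each viseme's weight is spread onto its immediate neighbors. The phoneme-to-viseme table, the PhonemeEvent type, and the blend parameter are hypothetical simplifications introduced here for illustration.

# Illustrative sketch only (Python). The table and names below are hypothetical,
# heavily reduced stand-ins for a real phoneme-to-viseme mapping.

from dataclasses import dataclass

PHONEME_TO_VISEME = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labiodental", "v": "labiodental",
    "aa": "open", "iy": "spread", "uw": "rounded",
    "sil": "rest",
}

@dataclass
class PhonemeEvent:
    phoneme: str
    start: float   # seconds
    end: float     # seconds

def to_viseme_track(events, blend=0.3):
    """Convert time-stamped phonemes to (time, {viseme: weight}) keyframes.

    `blend` is the fraction of a viseme's weight leaked onto the visemes of
    its immediate neighbors, a crude stand-in for co-articulation.
    """
    track = []
    for i, ev in enumerate(events):
        center = PHONEME_TO_VISEME.get(ev.phoneme, "rest")
        weights = {center: 1.0 - blend}
        # Spread part of the weight onto the previous and next visemes.
        for j in (i - 1, i + 1):
            if 0 <= j < len(events):
                neighbor = PHONEME_TO_VISEME.get(events[j].phoneme, "rest")
                weights[neighbor] = weights.get(neighbor, 0.0) + blend / 2
        keytime = (ev.start + ev.end) / 2
        track.append((keytime, weights))
    return track

if __name__ == "__main__":
    stream = [
        PhonemeEvent("sil", 0.00, 0.10),
        PhonemeEvent("m", 0.10, 0.18),
        PhonemeEvent("aa", 0.18, 0.35),
        PhonemeEvent("p", 0.35, 0.42),
    ]
    for t, w in to_viseme_track(stream):
        print(f"{t:.2f}s  {w}")

In the visyllable-based approach the chapter advocates, this explicit neighbor-blending step would be unnecessary, since each syllable unit already encodes the transitions between its constituent phonemes.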