Alignment of Speech and Co-speech Gesture in a Constraint-based Grammar

Authors: Katya Alahverdzhieva, Katya Saint-Amand

DOI:

Keywords: Phrase, Speech recognition, Natural language processing, Pragmatics, Gesture, Computer science, Syntax, Semantics, Grammar, Artificial intelligence, Gesture recognition, Prosody

Abstract: This thesis concerns the form-meaning mapping of multimodal communicative actions consisting of speech signals and improvised co-speech gestures produced spontaneously with the hand. The interaction between speech and its accompanying gestures has standardly been addressed from a cognitive perspective, to establish the underlying mechanisms of synchronous speech and gesture production, and from a computational perspective, to build computer systems that communicate through multiple modalities. Building on the findings of this previous research, we advance a new theory in which the form of the combined speech-and-gesture signal and its meaning are analysed in a constraint-based grammar. We propose several construction rules about the well-formedness of multimodal actions and motivate them empirically through an extensive and detailed corpus study. In particular, we use the prosody, syntax and semantics of the speech, the form of the gesture signal, and their relative temporal performance to constrain the derivation of a single tree, which in turn determines the meaning representation via standard semantic composition. Gestural form often underspecifies its meaning, so the output of our grammar is underspecified logical formulae that support the range of possible interpretations of the multimodal act in its final context-of-use, given current models of the semantics/pragmatics interface. It is widely held in the community that the co-expressivity of speech and gesture is determined on the basis of their co-occurrence: that is, the speech that is semantically related to a gesture is the speech that happened at the same time as the gesture. Whereas this is usually taken for granted, we provide a methodology for establishing, in a systematic and domain-independent way, the spoken element(s) a gesture can be aligned to, based on form, so as to yield a meaning representation that supports the intended interpretation(s) in context. 'Semantic' alignment is thus driven not by co-occurrence alone, but by the linguistic properties of the speech the gesture overlaps with. In so doing, we contribute a fine-grained system for articulating the uses of gesture with the methods of linguistics. We show that, just as language exhibits ambiguity, so do multimodal actions: for instance, the integration of a gesture is not restricted to a unique spoken phrase; rather, the gesture can be aligned with several phrases, yielding trees with distinct meaning representations. These one-to-many mappings stem from the fact that the meaning derived from gesture form is highly incomplete, even underspecified. An overall challenge is thus to account for the multimodal action in context using the syntactic and semantic tools of linguistics.
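The abstract describes alignment as constrained both by temporal overlap between gesture and speech and by the linguistic properties (e.g. prosodic prominence) of the overlapping speech, with several candidate attachments sometimes surviving. The sketch below is a minimal, hypothetical illustration of that idea in Python; it is not the grammar machinery of the thesis, and all names (Word, Gesture, candidate_attachments, the pitch-accent flag) are assumptions introduced only for this example.

```python
# Illustrative sketch (not the thesis's implementation): candidate speech-gesture
# alignment driven by temporal overlap plus a prosodic-prominence constraint.
# All names and data here are hypothetical.

from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    token: str
    start: float        # onset in seconds
    end: float          # offset in seconds
    prominent: bool     # carries a pitch accent


@dataclass
class Gesture:
    label: str
    start: float        # stroke onset
    end: float          # stroke offset


def overlaps(w: Word, g: Gesture) -> bool:
    """True if the word's time span intersects the gesture stroke."""
    return w.start < g.end and g.start < w.end


def candidate_attachments(words: List[Word], g: Gesture) -> List[List[Word]]:
    """Return the spoken spans a gesture could be aligned with.

    Constraint 1 (temporal): the span must overlap the gesture stroke.
    Constraint 2 (prosodic): the span must contain a prominent word.
    Several contiguous spans may satisfy both constraints, so the result
    can contain more than one candidate -- the one-to-many mapping the
    abstract describes.
    """
    overlapping = [w for w in words if overlaps(w, g)]
    candidates = []
    # Enumerate contiguous sub-spans of the overlapping region.
    for i in range(len(overlapping)):
        for j in range(i, len(overlapping)):
            span = overlapping[i:j + 1]
            if any(w.prominent for w in span):
                candidates.append(span)
    return candidates


if __name__ == "__main__":
    words = [
        Word("the", 0.0, 0.1, False),
        Word("spiral", 0.1, 0.5, True),      # accented word
        Word("staircase", 0.5, 1.0, False),
    ]
    g = Gesture("circular-motion", 0.2, 0.9)
    for span in candidate_attachments(words, g):
        print([w.token for w in span])
```

On this toy utterance the sketch returns both ['spiral'] and ['spiral', 'staircase'], mirroring the abstract's point that a single gesture need not align with a unique spoken phrase and can therefore yield more than one meaning representation.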
