Authors: G. Infantes, I. Ferrané, B. Burger, F. Lerasle
DOI:
Keywords:
Abstract: We designed an easy-to-use user interface based on speech and gesture modalities for controlling an interactive robot. This paper, after a brief description of this interface and the platform on which it is implemented, describes an embedded gesture recognition system that is part of a multimodal interface. We describe two methods, namely Hidden Markov Models and Dynamic Bayesian Networks, and discuss their relative performance for this task in our Human-Robot interaction context. The implementation of our DBN-based recognition is outlined and some quantitative results are shown.

I. INTRODUCTION

Since assistant robots are designed to directly interact with people, finding natural and easy-to-use user interfaces is of fundamental importance [1]. Nevertheless, few robotic systems are currently equipped with a completely on-board multimodal user interface enabling robot control through communication channels like speech, gesture, or both. The most advanced one is [2], in which a constraint-based system for speech and 3D pointing gestures has been developed, but it is limited to mono-manual gestures. In other works, like [3] or [4], gestures are often extracted from monocular images, losing depth information and thus losing the capability of dealing with gestures other than directional ones. With the intention of providing our robot, called Jido, with such a multimodal interface, we developed both speech and gesture recognition systems as well as a module for fusing these results. This merging step enables the system to:
− complete an underspecified sentence, since abbreviation and omission are usual in human communication, particularly if this can be done by gesture, which can even be used instead of speech;
− strengthen each modality by improving the classification rates of multimodal commands thanks to the probabilistic merge of speech and gesture results.

In this framework, this paper focuses on one- and two-handed gesture recognition given the video stream delivered by a stereo head, and on the physical constraints imposed by the autonomous robot background: mobility of the platform, shared computational power, memory capacities, etc.

The first section describes the platform and the background of the interface implemented on it, leading to an explanation of our needs for gesture recognition. Next, we describe Hidden Markov Models (HMM) and Dynamic Bayesian Networks (DBN) for this task, given the output of a visual tracker devoted to upper body extremities [5]. Then, the implementation of our DBN-based recognition is outlined. We describe more precisely the data clustering process, carried out by a Kohonen network, the model training, made by means of an Expectation-Maximization algorithm, and the recognition, performed using particle filtering [6]. Finally, qualitative and quantitative results on a database of symbolic and deictic gestures are presented.
The DBN representation, more commonly used for activity recognition, is shown to outperform the HMM representation, especially in terms of CPU time consumption and segmentation.