Authors: Silvia Cascianelli, Rita Cucchiara, Lorenzo Baraldi, Marcella Cornia, Federico Landi
DOI:
Keywords:
Abstract: Embodied AI has been recently gaining attention as it aims to foster the development of autonomous and intelligent agents. In this paper, we devise a novel embodied setting in which an agent needs to explore a previously unknown environment while recounting what it sees during the path. In this context, the agent needs to navigate the environment driven by an exploration goal, select proper moments for description, and output natural language descriptions of relevant objects and scenes. Our model integrates a self-supervised exploration module with penalty and a fully-attentive captioning model for explanation. Also, we investigate different policies for selecting proper moments for explanation, driven by information coming from both the environment and the navigation. Experiments are conducted on photorealistic environments from the Matterport3D dataset and investigate the navigation and explanation capabilities of the agent as well as the role of their interactions.