Authors: Vasu Sharma, Prasoon Goyal, Kaixiang Lin, Govind Thattai, Qiaozi Gao
DOI:
Keywords:
Abstract: We propose a multimodal (vision-and-language) benchmark for cooperative and heterogeneous multi-agent learning. We introduce a benchmark multimodal dataset with …