Authors: Behrad Toghi, Rodolfo Valiente, Dorsa Sadigh, Ramtin Pedarsani, Yaser P Fallah
DOI:
Keywords:
Abstract: … We introduce a multi-agent variant of the synchronous Advantage Actor-Critic (A2C) algorithm and train agents that coordinate with each other and can affect the behavior of human …