Authors: Argaman Mordoch, Daniel Portnoy, Roni Stern, Brendan Juba
DOI:
Keywords:
Abstract: We explore the problem of generating a plan for a team of heterogeneous collaborative agents without knowing their capabilities, but with access to observations of previously executed plans. To plan for such “black-box” collaborative agents, we present the Planning using Offline Learning (POL) framework. POL compiles the given observations into trajectories of a single “super” agent, and uses an action model learning algorithm to learn the capabilities of that “super” agent. We implement POL for Multi-agent STRIPS (MA-STRIPS) domains, and show that, when using the Safe Action Model (SAM) learning algorithm, it is guaranteed to be sound and to have a probabilistic form of completeness. Empirically, we evaluate POL on a standard MA-STRIPS benchmark. The results show that, in most cases, an almost perfect action model was learned for all agents from only a few trajectories. Finally, we discuss how POL and SAM learning can be extended to handle observations with concurrent and possibly conflicting actions.
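The compile-then-learn idea in the abstract can be illustrated with a minimal propositional sketch. It is not the authors' implementation: the state encoding (sets of ground facts), the `agent.action` naming scheme, the function names, and the assumption that exactly one agent acts per step are all hypothetical choices made for illustration. The learning rule is a simplified, conservative one in the spirit of SAM learning: a fact is kept as a precondition only if it held in every observed pre-state, and effects are read off the state differences.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

State = FrozenSet[str]
# One observed joint step: pre-state, {agent: action or None}, post-state.
JointStep = Tuple[State, Dict[str, Optional[str]], State]
# One step of the compiled single "super" agent.
Step = Tuple[State, str, State]


def compile_to_super_agent(joint_trajectory: List[JointStep]) -> List[Step]:
    """Rename each agent's action so that one 'super' agent owns all actions."""
    steps: List[Step] = []
    for pre, joint_action, post in joint_trajectory:
        active = [(agent, act) for agent, act in sorted(joint_action.items())
                  if act is not None]
        # Assumption of this sketch: no concurrent actions in a single step.
        if len(active) != 1:
            raise ValueError("sketch assumes exactly one acting agent per step")
        agent, act = active[0]
        steps.append((pre, f"{agent}.{act}", post))
    return steps


def learn_conservative_model(steps: List[Step]) -> Dict[str, Dict[str, set]]:
    """Learn preconditions and effects conservatively from observed steps."""
    model: Dict[str, Dict[str, set]] = {}
    for pre, action, post in steps:
        added, deleted = post - pre, pre - post
        if action not in model:
            # First observation: every fact in the pre-state may be required.
            model[action] = {"pre": set(pre), "add": set(added), "del": set(deleted)}
        else:
            # Keep only facts that held in every observed pre-state.
            model[action]["pre"] &= pre
            model[action]["add"] |= added
            model[action]["del"] |= deleted
    return model


# Hypothetical two-agent observation (a truck and a crane).
joint = [
    (frozenset({"at_a", "free"}), {"truck": "move_a_b", "crane": None},
     frozenset({"at_b", "free"})),
    (frozenset({"at_b", "free"}), {"truck": None, "crane": "lift"},
     frozenset({"at_b", "holding"})),
]
steps = compile_to_super_agent(joint)
model = learn_conservative_model(steps)
```

Because the learned preconditions only ever shrink toward the facts common to all observations, a model built this way may be overly restrictive but does not permit an action in a state where its observed requirements are missing; this is the intuition behind the soundness claim, stated here informally.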