Authors: Shlomo Zilberstein, Akshat Kumar
Keywords:
Abstract: Decentralized POMDPs provide a rigorous framework for multi-agent decision-theoretic planning. However, their high complexity has limited scalability. In this work, we present a promising new class of algorithms based on probabilistic inference for infinite-horizon ND-POMDPs, a restricted Dec-POMDP model. We first transform the policy optimization problem to that of likelihood maximization in a mixture of dynamic Bayes nets (DBNs). We then develop an Expectation-Maximization (EM) algorithm for maximizing the likelihood in this representation. The EM algorithm for ND-POMDPs lends itself naturally to a simple message-passing paradigm guided by the agent interaction graph. It is thus highly scalable w.r.t. the number of agents, can be easily parallelized, and produces good quality solutions.