Message-passing algorithms for large structured decentralized POMDPs

Authors: Shlomo Zilberstein, Akshat Kumar

DOI: 10.5555/2034396.2034431

Abstract: Decentralized POMDPs provide a rigorous framework for multi-agent decision-theoretic planning. However, their high complexity has limited scalability. In this work, we present a promising new class of algorithms based on probabilistic inference for infinite-horizon ND-POMDPs, a restricted Dec-POMDP model. We first transform the policy optimization problem to that of likelihood maximization in a mixture of dynamic Bayes nets (DBNs). We then develop the Expectation-Maximization (EM) algorithm for maximizing the likelihood in this representation. The EM algorithm for ND-POMDPs lends itself naturally to a simple message-passing paradigm guided by the agent interaction graph. It is thus highly scalable w.r.t. the number of agents, can be easily parallelized, and produces good quality solutions.
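
The abstract's first step, recasting policy optimization as likelihood maximization in a mixture of DBNs, can be illustrated on a toy problem. The sketch below is not the paper's ND-POMDP algorithm: it assumes a single fixed policy that has already been folded into a two-state Markov chain, and it uses the standard construction in which the mixture component of length T has prior (1 - gamma) * gamma**T and a binary reward event fires with probability equal to the reward rescaled to [0, 1]. All names and numbers (`policy_value`, `mixture_likelihood`, `P`, `r`, `mu0`) are hypothetical and chosen only for illustration.

```python
def policy_value(P, r, mu0, gamma, sweeps=2000):
    """Discounted value of a fixed policy via repeated Bellman backups."""
    n = len(r)
    v = [0.0] * n
    for _ in range(sweeps):
        v = [r[s] + gamma * sum(P[s][t] * v[t] for t in range(n))
             for s in range(n)]
    return sum(mu0[s] * v[s] for s in range(n))


def mixture_likelihood(P, r, mu0, gamma, horizon=2000):
    """Likelihood of the binary reward event in a mixture of finite-time chains.

    Component T has prior (1 - gamma) * gamma**T and emits the reward event
    at time T with probability r_hat(s_T), the reward rescaled to [0, 1].
    Assumes max(r) > min(r) so the rescaling is well defined.
    """
    n = len(r)
    r_min, r_max = min(r), max(r)
    r_hat = [(x - r_min) / (r_max - r_min) for x in r]
    dist = list(mu0)  # state marginal at time T, starting from mu0
    likelihood = 0.0
    for T in range(horizon):
        expected_r_hat = sum(dist[s] * r_hat[s] for s in range(n))
        likelihood += (1 - gamma) * gamma**T * expected_r_hat
        # advance the state marginal by one step of the chain
        dist = [sum(dist[s] * P[s][t] for s in range(n)) for t in range(n)]
    return likelihood


# Hypothetical two-state chain induced by some fixed policy.
P = [[0.9, 0.1], [0.4, 0.6]]   # transition probabilities
r = [-1.0, 3.0]                # per-state rewards
mu0 = [1.0, 0.0]               # initial state distribution
gamma = 0.9

V = policy_value(P, r, mu0, gamma)
L = mixture_likelihood(P, r, mu0, gamma)
# Undo the rescaling: V = ((r_max - r_min) * L + r_min) / (1 - gamma)
V_from_L = ((max(r) - min(r)) * L + min(r)) / (1 - gamma)
print(V, V_from_L)
```

Because the value is an increasing affine function of the likelihood, any policy parameters that raise the likelihood also raise the value, which is what allows an EM procedure to serve as a policy optimizer in this setting.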

References (8)
Amos Storkey, Stefan Harmeling, Marc Toussaint. Probabilistic inference for solving (PO)MDPs. School of Informatics, Institute for Adaptive and Neural Computation (2006).
Shlomo Zilberstein, Akshat Kumar. Event-detecting multi-agent MDPs: complexity and constant-factor approximation. International Joint Conference on Artificial Intelligence, pp. 201-207 (2009).
Shlomo Zilberstein, Akshat Kumar. Anytime planning for decentralized POMDPs using expectation maximization. Uncertainty in Artificial Intelligence, pp. 294-301 (2010).
Pradeep Varakantham, Ranjit Nair, Milind Tambe, Makoto Yokoo. Networked distributed POMDPs: a synthesis of distributed constraint optimization and POMDPs. National Conference on Artificial Intelligence, pp. 133-139 (2005).
Shlomo Zilberstein, Akshat Kumar. Constraint-based dynamic programming for decentralized POMDPs with structured interactions. Adaptive Agents and Multi Agents Systems, pp. 561-568 (2009).
Victor Lesser, Milind Tambe, Charles L. Ortiz. Distributed Sensor Networks: A Multiagent Perspective. Kluwer Academic Publishers (2003).
Daniel S. Bernstein, Robert Givan, Neil Immerman, Shlomo Zilberstein. The Complexity of Decentralized Control of Markov Decision Processes. Mathematics of Operations Research, vol. 27, pp. 819-840 (2002). DOI: 10.1287/MOOR.27.4.819.297
Shlomo Zilberstein, Victor Lesser, Claudia V. Goldman, Raphen Becker. Solving transition independent decentralized Markov decision processes. Journal of Artificial Intelligence Research, vol. 22, pp. 423-455 (2004). DOI: 10.5555/1622487.1622500