Partially Observable Markov Decision Processes

Author: Matthijs T. J. Spaan

DOI: 10.1007/978-3-642-27645-3_12

Keywords:

Abstract: For reinforcement learning in environments in which an agent has access to a reliable state signal, methods based on the Markov decision process (MDP) have had many successes. In many problem domains, however, an agent suffers from limited sensing capabilities that preclude it from recovering a Markovian state signal from its perceptions. Extending the MDP framework, partially observable Markov decision processes (POMDPs) allow for principled decision making under conditions of uncertain sensing. In this chapter we present the POMDP model by focusing on the differences with fully observable MDPs, and we show how optimal policies for POMDPs can be represented. Next, we give a review of model-based techniques for policy computation, followed by an overview of the available model-free methods for POMDPs. We conclude by highlighting recent trends in POMDP reinforcement learning.
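The uncertain sensing the abstract describes is typically handled by maintaining a belief state, i.e. a probability distribution over states that is updated by Bayes' rule after each action and observation. The following is a minimal sketch of that belief update on the classic two-state "tiger" problem; the transition and observation probabilities used here are the standard textbook values for that example, not numbers taken from this chapter.

```python
import numpy as np

# States: 0 = tiger-left, 1 = tiger-right. Action considered: "listen".
# T[s, s'] = P(s' | s, listen); listening does not move the tiger.
T = np.eye(2)
# O[s', o] = P(o | s', listen); o: 0 = hear-left, 1 = hear-right.
# Listening is 85% accurate in the standard tiger problem.
O = np.array([[0.85, 0.15],
              [0.15, 0.85]])

def belief_update(b, obs):
    """Bayes-filter belief update: b'(s') ∝ O(o|s') * sum_s T(s,s') b(s)."""
    predicted = T.T @ b                    # prediction step (identity here)
    unnormalized = O[:, obs] * predicted   # correction by the observation
    return unnormalized / unnormalized.sum()

b = np.array([0.5, 0.5])          # initially uncertain about the tiger
b = belief_update(b, obs=0)       # listen, hear the tiger on the left
# belief shifts toward tiger-left: b ≈ [0.85, 0.15]
```

Because the belief is a sufficient statistic for the observation history, a POMDP can be viewed as a fully observable MDP over this continuous belief space, which is what makes value-function representations over beliefs possible.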

References (119)
Satinder P. Singh, John Loch: Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes. International Conference on Machine Learning, pp. 323–331 (1998)
Ronen I. Brafman, Pascal Poupart, Guy Shani, Solomon E. Shimony: Efficient ADD Operations for Point-Based Algorithms. International Conference on Automated Planning and Scheduling, pp. 330–337 (2008)
Sven Koenig, Reid Simmons: Probabilistic Robot Navigation in Partially Observable Environments. International Joint Conference on Artificial Intelligence, pp. 1080–1087 (1995)
Stuart Russell, Ronald Parr: Approximating Optimal Policies for Partially Observable Stochastic Domains. International Joint Conference on Artificial Intelligence, pp. 1088–1094 (1995)
Eric A. Hansen, Zhengzhu Feng: Dynamic Programming for POMDPs Using a Factored State Representation. International Conference on Artificial Intelligence Planning Systems, pp. 130–139 (2000)
Anthony Rocco Cassandra, Leslie Pack Kaelbling: Exact and Approximate Algorithms for Partially Observable Markov Decision Processes. Brown University (1998)
Darius Braziunas, Craig Boutilier: Stochastic Local Search for POMDP Controllers. National Conference on Artificial Intelligence, pp. 690–696 (2004)
Edward Jay Sondik: The Optimal Control of Partially Observable Markov Processes. UMI Dissertation Service (1971)