Control strategies for a stochastic planner

Authors: Stuart Russell, Jonathan Tash

Abstract: We present new algorithms for local planning over Markov decision processes. The base-level algorithm possesses several interesting features for control of computation, based on selecting computations according to their expected benefit to decision quality. The algorithms are shown to expand the agent's knowledge where the world warrants it, with appropriate responsiveness to time pressure and randomness. We then develop an introspective algorithm, using an internal representation of what computational work has already been done. This strategy extends the agent's knowledge base where warranted by the agent's world model and by the computational work put into various parts of this model. It also enables the agent to act so as to take advantage of the computational savings inherent in staying in the known state space. The control flexibility provided by this strategy, by incorporating natural problem-solving methods, directs computational effort towards where it's needed better than previous approaches, providing greater hopes of scalability to large domains.

References (9)
Eric B. Baum. On optimal game tree propagation for imperfect players. National Conference on Artificial Intelligence, pp. 507-512 (1992).
Thomas Dean, Leslie Pack Kaelbling, Jak Kirman, Ann Nicholson. Deliberation scheduling for time-critical sequential decision making. Uncertainty in Artificial Intelligence, pp. 309-316 (1993). DOI: 10.1016/B978-1-4832-1451-1.50042-1
Richard S. Sutton. Planning by Incremental Dynamic Programming. Machine Learning Proceedings 1991, pp. 353-357 (1991). DOI: 10.1016/B978-1-55860-200-7.50073-8
Sylvie Thiébaux, Joachim Hertzberg, William Shoaff, Moti Schneider. A stochastic model of actions and plans for anytime planning under uncertainty. International Journal of Intelligent Systems, vol. 10, pp. 155-183 (1995). DOI: 10.1002/INT.4550100202
Richard S. Sutton. Learning to Predict by the Methods of Temporal Differences. Machine Learning, vol. 3, pp. 9-44 (1988). DOI: 10.1023/A:1022633531479
Thomas Dean, Mark Boddy. An analysis of time-dependent planning. National Conference on Artificial Intelligence, pp. 49-54 (1988).
Sven Koenig. Optimal Probabilistic and Decision-Theoretic Planning using Markovian. University of California at Berkeley (1992).
Stuart Russell, Eric Wefald. Do the right thing (1991).