Decision tree methods for finding reusable MDP homomorphisms

Authors: Andrew G. Barto, Alicia Peregrin Wolfe

Keywords: Class (computer programming), Computer science, State space, Bellman equation, Decision tree, Sample (statistics), Artificial intelligence, Homomorphism, State (functional analysis)

Abstract: State abstraction is a useful tool for agents interacting with complex environments. Good state abstractions are compact, reusable, and easy to learn from sample data. This paper combines and extends two existing classes of methods to achieve these criteria. The first class of methods searches for MDP homomorphisms (Ravindran 2004), which produce models of reward and transition probabilities in an abstract state space. The second class of methods, like the UTree algorithm (McCallum 1995), learns compact models of the value function quickly from sample data. Models based on MDP homomorphisms can easily be extended such that they are usable across tasks with similar reward functions. However, value-based methods like UTree cannot be extended in this fashion. We present results showing that a new, combined algorithm fulfills all three criteria: the resulting models are compact, can be learned quickly from sample data, and can be used ...

References (1)
Richard S. Sutton, Doina Precup, Satinder Singh. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, vol. 112, pp. 181-211 (1999). DOI: 10.1016/S0004-3702(99)00052-1