Reinforcement learning as classification: leveraging modern classifiers

Authors: Michail G. Lagoudakis, Ronald Parr

Abstract: The basic tools of machine learning appear in the inner loop of most reinforcement learning algorithms, typically in the form of Monte Carlo methods or function approximation techniques. To a large extent, however, current reinforcement learning algorithms draw upon machine learning techniques that are at least ten years old and, with a few exceptions, very little has been done to exploit recent advances in classification learning for the purposes of reinforcement learning. We use a variant of approximate policy iteration based on rollouts that allows us to use a pure classification learner, such as a support vector machine (SVM), in the inner loop of the algorithm. We argue that the use of SVMs, particularly in combination with the kernel trick, can make it easier to apply reinforcement learning as an "out-of-the-box" technique, without extensive feature engineering. Our approach opens the door to modern classification methods, but does not preclude the use of classical methods. We present experimental results in the pendulum balancing and bicycle riding domains using both SVMs and neural networks for classifiers.
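As a rough illustration of the idea in the abstract, the following is a minimal Python sketch of rollout-based approximate policy iteration in which the policy is represented by a classifier. It is not the authors' implementation. Everything named here is an assumption made for illustration: the functions `step`, `rollout_q`, `fit_nearest_neighbor`, and `classification_policy_iteration`, the 11-state chain, the reward for landing on state 5, the discount of 0.9, and the horizon of 20. A 1-nearest-neighbor rule stands in for the SVM or neural network classifier so that the example has no dependencies, and because the toy dynamics are deterministic, a single rollout per state-action pair replaces the averaged Monte Carlo rollouts a stochastic domain would need.

```python
ACTIONS = (-1, +1)          # move left or right (hypothetical toy actions)
STATES = range(11)          # integer positions 0..10 (hypothetical toy states)
GAMMA = 0.9                 # discount factor (assumed)


def step(state, action):
    """Toy dynamics: clipped walk on 0..10, reward 1 for landing on state 5."""
    nxt = max(0, min(10, state + action))
    return nxt, (1.0 if nxt == 5 else 0.0)


def rollout_q(state, action, policy, horizon=20):
    """Estimate Q(state, action): take `action`, then follow `policy`."""
    total, discount = 0.0, 1.0
    for _ in range(horizon):
        state, reward = step(state, action)
        total += discount * reward
        discount *= GAMMA
        action = policy(state)
    return total


def fit_nearest_neighbor(examples):
    """Stand-in classifier: 1-nearest neighbor over (state, action) pairs."""
    def classify(state):
        _, label = min(examples, key=lambda ex: abs(ex[0] - state))
        return label
    return classify


def classification_policy_iteration(initial_policy, n_iterations=5):
    """Repeatedly label states by rollouts and train a classifier on the labels."""
    policy = initial_policy
    for _ in range(n_iterations):
        examples = []
        for s in STATES:
            q = {a: rollout_q(s, a, policy) for a in ACTIONS}
            best = max(q, key=q.get)
            # Keep only states where one action clearly dominates the others.
            if all(q[best] - v > 1e-9 for a, v in q.items() if a != best):
                examples.append((s, best))
        if not examples:
            break               # no informative labels; keep the current policy
        policy = fit_nearest_neighbor(examples)
    return policy
```

Starting from a policy that always moves right, for example `classification_policy_iteration(lambda s: +1)`, the learned policy on this toy chain moves toward state 5 from either side. Replacing `fit_nearest_neighbor` with another classifier changes only the training step; the rollout and labeling steps stay the same.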
