Reinforcement learning as classification: leveraging modern classifiers

Abstract: The basic tools of machine learning appear in the inner loop of most reinforcement learning algorithms, typically in the form of Monte Carlo methods or function approximation techniques. To a large extent, however, current reinforcement learning algorithms draw upon machine learning techniques that are at least ten years old and, with a few exceptions, very little has been done to exploit recent advances in classification learning for the purposes of reinforcement learning. We use a variant of approximate policy iteration based on rollouts that allows us to use a pure classification learner, such as a support vector machine (SVM), in the inner loop of the algorithm. We argue that the use of SVMs, particularly in combination with the kernel trick, can make it easier to apply reinforcement learning as an "out-of-the-box" technique, without extensive feature engineering. Our approach opens the door to modern classification methods, but does not preclude the use of classical methods. We present experimental results in the pendulum balancing and bicycle riding domains using both SVMs and neural networks as classifiers.
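The rollout-based scheme summarized in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and does not come from the paper: the one-dimensional toy dynamics in `step`, the discount and horizon values, the simple margin test used to decide whether one action is clearly better, and the 1-nearest-neighbour learner standing in for an SVM so that the example needs only the standard library. Any classifier exposing `fit` and `predict` could be substituted.

```python
import random

ACTIONS = (-1, +1)
GAMMA = 0.9  # discount factor (assumed value)


def step(x, a):
    """Toy dynamics (hypothetical): move 0.1 left or right, clipped to [-1, 1]."""
    x_next = max(-1.0, min(1.0, x + 0.1 * a))
    return x_next, -abs(x_next)  # reward is highest at the origin


def rollout_value(x, a, policy, horizon=20):
    """Estimate Q(x, a): take action a once, then follow `policy`."""
    total, discount = 0.0, 1.0
    for _ in range(horizon):
        x, r = step(x, a)
        total += discount * r
        discount *= GAMMA
        a = policy(x)
    return total


class NearestNeighbour:
    """Stand-in classifier (1-NN); an SVM could take its place."""

    def fit(self, xs, ys):
        self.data = list(zip(xs, ys))
        return self

    def predict(self, x):
        return min(self.data, key=lambda p: abs(p[0] - x))[1]


def policy_iteration(n_iters=5, n_states=50, seed=0):
    rng = random.Random(seed)
    policy = lambda x: +1  # arbitrary initial policy
    for _ in range(n_iters):
        xs, ys = [], []
        for _ in range(n_states):
            x = rng.uniform(-1.0, 1.0)
            # Rollouts give one value estimate per action at this state.
            q = {a: rollout_value(x, a, policy) for a in ACTIONS}
            best = max(q, key=q.get)
            worst = min(q, key=q.get)
            # Keep the state as a training example only if one action
            # clearly dominates (dynamics are deterministic here, so a
            # small margin replaces a statistical test).
            if q[best] - q[worst] > 1e-9:
                xs.append(x)
                ys.append(best)
        if not xs:
            break
        # The improved policy is represented purely by the classifier.
        policy = NearestNeighbour().fit(xs, ys).predict
    return policy


if __name__ == "__main__":
    learned = policy_iteration()
    print(learned(0.8), learned(-0.8))
```

In this sketch no value function is ever stored: each iteration turns rollout comparisons into labelled (state, action) pairs, and the classifier trained on them serves directly as the next policy.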
