Sparse Approximations to Value Functions in Reinforcement Learning

Authors: Hunor S. Jakab, Lehel Csató

DOI: 10.1007/978-3-319-09903-3_14

Abstract: We present a novel sparsification and value-function approximation method for on-line reinforcement learning in continuous state-action spaces. Our approach is based on the kernel least-squares temporal difference (KLSTD) algorithm. We derive a recursive version and enhance the algorithm with a new sparsification mechanism based on the topology obtained from proximity graphs. The sparsification mechanism, necessary to speed up computations, favors datapoints that minimize the divergence of the target-function gradient, thereby also taking the shape of the target function into account. The performance of our algorithm is tested on a standard benchmark RL problem, and comparisons with existing approaches are provided.
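
To make the method described in the abstract concrete, the following is a minimal Python sketch of batch kernel LSTD with a simple novelty-based dictionary for sparsification. All names (rbf, kernel_lstd, the threshold nu) and the Gaussian kernel choice are illustrative assumptions; in particular, the plain novelty criterion below stands in for the paper's proximity-graph, gradient-divergence sparsification rule, and the batch solve stands in for the recursive version the paper derives.

```python
import numpy as np

def rbf(x, y, sigma=0.5):
    """Gaussian RBF kernel between two state vectors (illustrative choice)."""
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))

def kernel_lstd(transitions, gamma=0.95, nu=0.1, reg=1e-3):
    """Batch kernel LSTD with a simple novelty-based dictionary.

    transitions: iterable of (s, r, s_next) with np.ndarray states.
    A state enters the dictionary only if its best kernel similarity
    to existing dictionary points is below 1 - nu; this is a plain
    novelty criterion, not the paper's graph-based rule.
    """
    dictionary = []
    for s, _, _ in transitions:
        if not dictionary or max(rbf(s, d) for d in dictionary) < 1.0 - nu:
            dictionary.append(s)

    def k(s):
        # Feature vector: kernel evaluations against the dictionary points.
        return np.array([rbf(s, d) for d in dictionary])

    m = len(dictionary)
    A = reg * np.eye(m)  # small ridge term keeps A well conditioned
    b = np.zeros(m)
    for s, r, s_next in transitions:
        phi, phi_next = k(s), k(s_next)
        A += np.outer(phi, phi - gamma * phi_next)  # LSTD normal equations
        b += phi * r
    alpha = np.linalg.solve(A, b)  # value estimate: V(s) = alpha . k(s)
    return dictionary, alpha

if __name__ == "__main__":
    # Toy usage: value estimation on a 1-D random walk with cost |s|.
    rng = np.random.default_rng(0)
    states = rng.normal(size=(200, 1))
    transitions = [(states[t], -abs(float(states[t][0])), states[t + 1])
                   for t in range(len(states) - 1)]
    D, alpha = kernel_lstd(transitions)
    v0 = float(np.dot(alpha, [rbf(np.zeros(1), d) for d in D]))
    print(f"dictionary size: {len(D)}, V(0) ~ {v0:.3f}")
```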

References (30)
Lehel Csató, Gaussian processes: iterative sparse approximations. PhD thesis, Aston University (2002)
Peter McCullagh, John Ashworth Nelder, Generalized Linear Models (1983)
O. Pujol, J.P. Pérez, J.P. Ramis, C. Simó, S. Simon, J.A. Weil, Swinging Atwood Machine: Experimental and numerical results, and a theoretical study. Physica D: Nonlinear Phenomena, vol. 239, pp. 1067–1081 (2010), DOI 10.1016/J.PHYSD.2010.02.017
Dietmar Saupe, Mauro R. Ruggeri, Isometry-invariant matching of point set surfaces. Eurographics, pp. 17–24 (2008), DOI 10.5555/2381112.2381116
Justin A. Boyan, Least-Squares Temporal Difference Learning. International Conference on Machine Learning, pp. 49–56 (1999)
Gavin Taylor, Ronald Parr, Kernelized value function approximation for reinforcement learning. Proceedings of the 26th Annual International Conference on Machine Learning (ICML '09), pp. 1017–1024 (2009), DOI 10.1145/1553374.1553504
Steven J. Bradtke, Andrew G. Barto, Linear least-squares algorithms for temporal difference learning. Machine Learning, vol. 22, pp. 33–57 (1996), DOI 10.1007/BF00114723