Efficient Reinforcement Learning Using Gaussian Processes

Author: Marc Peter Deisenroth

DOI:

Keywords:

Abstract: This book examines Gaussian processes (GPs) in both model-based reinforcement learning (RL) and inference in nonlinear dynamic systems. First, we introduce PILCO, a fully Bayesian approach for efficient RL in continuous-valued state and action spaces when no expert knowledge is available. PILCO takes model uncertainties consistently into account during long-term planning to reduce model bias. Second, we propose principled algorithms for robust filtering and smoothing in GP dynamic systems.
