On Applications of Bootstrap in Continuous Space Reinforcement Learning.

Authors: George Michailidis, Ambuj Tewari, Mohamad Kazem Shirani Faradonbeh

Abstract: In decision making problems for continuous state and action spaces, linear dynamical models are widely employed. Specifically, policies for stochastic linear systems subject to quadratic cost functions capture a large number of applications in reinforcement learning. Selected randomized policies have been studied in the literature recently that address the trade-off between identification and control. However, little is known about policies based on bootstrapping observed states and actions. In this work, we show that bootstrap-based policies achieve a square root scaling of regret with respect to time. We also obtain results on the accuracy of learning the model's dynamics. Corroborative numerical analysis that illustrates the technical results is also provided.

References (51)
Sham M Kakade, Thomas P Hayes, Varsha Dani. Stochastic Linear Optimization Under Bandit Feedback. Conference on Learning Theory, pp. 355-366 (2008).
Csaba Szepesvári, Yasin Abbasi-Yadkori. Regret Bounds for the Adaptive Control of Linear Quadratic Systems. Conference on Learning Theory, pp. 1-26 (2011).
M. C. Campi, S. Bittanti. Adaptive Control of Linear Time Invariant Systems: The "Bet on the Best" Principle. Communications in Information and Systems, vol. 6, pp. 299-320 (2006).
Emanuel Todorov, Weiwei Li. Iterative Linear Quadratic Regulator Design for Nonlinear Biological Movement Systems. pp. 222-229 (2004).
Peter Dorato, Vito Cerone, Chaouki Abdallah. Linear Quadratic Control: An Introduction. (2000).
Ian Osband, Benjamin Van Roy. Bootstrapped Thompson Sampling and Deep Exploration. arXiv: Machine Learning (2015).
Maurits Kaptein, Dean Eckles. Thompson sampling with the online bootstrap. arXiv: Learning (2014).
Lihong Li. Sample Complexity Bounds of Exploration. Reinforcement Learning, pp. 175-204 (2012). doi:10.1007/978-3-642-27645-3_6
T.L Lai. Asymptotically efficient adaptive control in stochastic regression models. Advances in Applied Mathematics, vol. 7, pp. 23-45 (1986). doi:10.1016/0196-8858(86)90004-7