Apprenticeship learning and reinforcement learning with application to robotic control

Author: Pieter Abbeel

Abstract: Many problems in robotics have unknown, stochastic, high-dimensional, and highly nonlinear dynamics, and offer significant challenges to both traditional control methods and reinforcement learning algorithms. Some of the key difficulties that arise in these problems are: (i) It is often difficult to write down, in closed form, a formal specification of the control task. For example, what is the objective function for "flying well"? (ii) It is often difficult to build a good dynamics model, because of both the data collection and the modeling challenges involved (similar to the "exploration problem" in reinforcement learning). (iii) It is often computationally expensive to find closed-loop controllers for high-dimensional, stochastic domains. We describe learning algorithms with formal performance guarantees which show that these problems can be efficiently addressed in the apprenticeship learning setting, the setting in which expert demonstrations of the task are available. Our algorithms are guaranteed to return a policy with performance comparable to the expert's, when evaluated on the same (typically high-dimensional and non-linear) environment as the expert. Besides having theoretical guarantees, our algorithms have also enabled us to solve some previously unsolved real-world control problems: They have enabled a quadruped robot to traverse challenging, previously unseen terrain, and they have significantly extended the state of the art in autonomous helicopter flight. Our helicopter has performed by far the most challenging aerobatic maneuvers flown by any autonomous helicopter to date, including maneuvers such as continuous in-place flips, rolls, and tic-tocs, which only exceptional human pilots can fly. Our aerobatic flight performance is comparable to that of the best human pilots.
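The apprenticeship learning setting the abstract describes can be illustrated with the feature-expectation matching idea from Abbeel and Ng's projection algorithm: find a policy whose discounted feature expectations match the expert's, which bounds the policy's loss under any reward that is linear in the features. Below is a minimal, self-contained sketch of that idea; the 4-state chain MDP, the one-hot features, and the hand-coded "expert" are illustrative assumptions for the demo, not taken from the thesis.

```python
import numpy as np

# Tiny deterministic chain MDP (an assumption for illustration only):
# action 0 steps left, action 1 steps right.
N_STATES, GAMMA, START = 4, 0.9, 0
NEXT = np.array([[0, 0, 1, 2],   # action 0: move left
                 [1, 2, 3, 3]])  # action 1: move right
PHI = np.eye(N_STATES)           # one-hot state features

def feature_expectations(policy, horizon=500):
    """mu(pi) = sum_t gamma^t phi(s_t) along the policy's rollout."""
    mu, s = np.zeros(N_STATES), START
    for t in range(horizon):
        mu += GAMMA**t * PHI[s]
        s = NEXT[policy[s], s]
    return mu

def greedy_policy(w, iters=200):
    """Value iteration under reward r(s) = w . phi(s); return greedy policy."""
    r = PHI @ w
    V = np.zeros(N_STATES)
    for _ in range(iters):
        V = r + GAMMA * np.max(V[NEXT], axis=0)
    return np.argmax(V[NEXT], axis=0)

expert = np.array([1, 1, 1, 1])          # "expert" demo: always move right
mu_E = feature_expectations(expert)

# Projection algorithm: maintain mu_bar, the closest point to mu_E in the
# convex hull of feature expectations found so far, and repeatedly compute
# a best-response policy to the reward direction w = mu_E - mu_bar.
policy = np.zeros(N_STATES, dtype=int)   # initialize with "always left"
mu_bar = feature_expectations(policy)
for _ in range(20):
    w = mu_E - mu_bar
    if np.linalg.norm(w) < 1e-6:         # feature expectations matched
        break
    policy = greedy_policy(w)
    mu = feature_expectations(policy)
    d = mu - mu_bar
    if d @ d < 1e-12:                    # no progress possible
        break
    mu_bar = mu_bar + (d @ w) / (d @ d) * d   # orthogonal projection step
```

On this toy chain the loop recovers the expert's behavior in a couple of iterations; the real algorithms in the thesis use the same matching principle with learned dynamics models and far higher-dimensional features.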
