Linear inverse reinforcement learning in continuous time and space

Author: Rushikesh Kamalapurkar

DOI: 10.23919/ACC.2018.8431430

Keywords: State (functional analysis), Mathematical optimization, Autonomous agent, Constant (mathematics), Computer science, Trajectory, Linear system, Estimator, Function (mathematics)

Abstract: This paper develops a data-driven inverse reinforcement learning technique for a class of linear systems to estimate the cost function of an agent online, using input-output measurements. A simultaneous state and parameter estimator is utilized to facilitate output-feedback learning, and estimation is achieved up to multiplication by a constant.
