Model-based Reinforcement Learning with Parametrized Physical Models and Optimism-Driven Exploration

Authors: Pieter Abbeel, Teodor Moldovan, Sergey Levine, Sachin Patil, Christopher Xie

Abstract: In this paper, we present a robotic model-based reinforcement learning method that combines ideas from model identification and model predictive control. We use a feature-based representation of the dynamics that allows the dynamics model to be fitted with a simple least squares procedure, and the features are identified from a high-level specification of the robot's morphology, consisting of the number and connectivity structure of its links. Model predictive control is then used to choose the actions under an optimistic model of the dynamics, which produces an efficient and goal-directed exploration strategy. We present real time experimental results on standard benchmark problems involving the pendulum, cartpole, and double pendulum systems. Experiments indicate that our method is able to learn a range of benchmark tasks substantially faster than the previous best methods. To evaluate our approach on a realistic robotic control task, we also demonstrate real time control of a simulated 7 degree of freedom arm.
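
The least squares fitting step mentioned in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the single-pendulum feature set (`sin(theta)`, `theta_dot`, `u`), the function names `pendulum_features` and `fit_dynamics`, the "true" weights, and the synthetic data. The sketch covers only the model-fitting idea, not the optimistic model predictive control step.

```python
import numpy as np

def pendulum_features(theta, theta_dot, u):
    """Hypothetical feature map: the angular acceleration is assumed
    to be linear in sin(theta), theta_dot, and the applied torque u."""
    return np.stack([np.sin(theta), theta_dot, u], axis=-1)

def fit_dynamics(theta, theta_dot, u, theta_ddot):
    """Ordinary least squares: find w minimizing ||features @ w - theta_ddot||."""
    phi = pendulum_features(theta, theta_dot, u)
    w, *_ = np.linalg.lstsq(phi, theta_ddot, rcond=None)
    return w

# Synthetic data from assumed "true" weights (gravity, damping, torque gain).
rng = np.random.default_rng(0)
true_w = np.array([-9.81, -0.1, 1.0])
theta = rng.uniform(-np.pi, np.pi, 200)
theta_dot = rng.uniform(-2.0, 2.0, 200)
u = rng.uniform(-1.0, 1.0, 200)
noise = 0.01 * rng.standard_normal(200)
theta_ddot = pendulum_features(theta, theta_dot, u) @ true_w + noise

# Recover the weights from the noisy observations.
w_hat = fit_dynamics(theta, theta_dot, u, theta_ddot)
print("estimated weights:", w_hat)
```

Because the assumed model is linear in its weights, the fit reduces to a single call to `np.linalg.lstsq`, which is what makes a feature-based representation cheap to re-estimate as new data arrives.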