Authors: Andrew G. Barto, Theodore J. Perkins
DOI:
Keywords:
Abstract: Lyapunov analysis is a standard approach to studying the stability of dynamical systems and to designing controllers. We propose to design the actions of a reinforcement learning (RL) agent to be descending on a Lyapunov function. For minimum cost-to-target problems, this has the theoretical benefit of guaranteeing that the agent will reach a goal state on every trial, regardless of the RL algorithm it uses. In practice, Lyapunov-descent constraints can significantly shorten learning trials, improve initial and worst-case performance, and accelerate learning. Although this method of constraining actions may limit the extent to which an RL agent can minimize cost, it allows one to construct robust RL systems for problems in which Lyapunov domain knowledge is available. This includes many important individual problems as well as general classes of problems, such as the control of feedback linearizable systems (e.g., industrial robots) and continuous-state path-planning problems. We demonstrate the general approach on two simulated control problems: pendulum swing-up and robot arm control.
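
The core idea in the abstract, restricting an agent to actions that descend on a Lyapunov function, can be illustrated with a minimal sketch. This is not the authors' implementation or either of their test problems: it assumes a hypothetical unbounded grid world in which Manhattan distance to the goal plays the role of the Lyapunov function, and every name (`lyapunov`, `descending_actions`, `run_trial`) is invented for illustration. Because each permitted action strictly decreases the function, even a purely random policy reaches the goal on every trial.

```python
import random

# Hypothetical toy setting (not from the paper): an unbounded 2-D grid
# with four unit moves.
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def lyapunov(state, goal):
    """Assumed Lyapunov function: Manhattan distance to the goal."""
    return abs(state[0] - goal[0]) + abs(state[1] - goal[1])


def step(state, action):
    """Deterministic transition: move one cell in the chosen direction."""
    return (state[0] + action[0], state[1] + action[1])


def descending_actions(state, goal):
    """Keep only actions whose successor strictly lowers the Lyapunov value."""
    current = lyapunov(state, goal)
    return [a for a in ACTIONS if lyapunov(step(state, a), goal) < current]


def run_trial(start, goal, rng):
    """Follow a random policy over the constrained action set.

    Returns the number of steps taken to reach the goal.
    """
    state, steps = start, 0
    while state != goal:
        action = rng.choice(descending_actions(state, goal))
        state = step(state, action)
        steps += 1
    return steps


if __name__ == "__main__":
    rng = random.Random(0)
    print(run_trial((0, 0), (3, 4), rng))
```

In this toy case each allowed move lowers the distance by exactly one, so the trial length equals the starting distance whatever the policy picks. A learning algorithm such as Q-learning could replace the random choice without affecting the guarantee that the goal is reached. The constraint can still exclude cheaper paths in problems where the lowest-cost route temporarily increases the Lyapunov function, which is the trade-off the abstract describes.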