Memory-based Modeling and Prioritized Sweeping in Reinforcement Learning

Author: M.J.G. Ramakers

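Abstract: Reinforcement Learning (RL) is a popular method in machine learning. In RL, an agent learns a policy by observing state-transitions and receiving feedback in the form of a reward signal. The learning problem can be solved by interaction with the system only, without prior knowledge of that system. However, real-time learning from interaction with the system only leads to slow learning, as every time-interval can only be used to observe a single state-transition. Learning can be accelerated by using a Dyna-style algorithm. This approach learns from interaction with the real system and from a model of that system simultaneously. Our research investigates two aspects of this method: building a model during learning, and implementing that model into the learning algorithm. We use a memory-based modeling method called Local Linear Regression (LLR) to build a state-transition model during the learning process. It is expected that the quality of the model increases as the number of observed state-transitions increases. To assess the quality of the modeled state-transitions, we introduce prediction intervals. We show that LLR is able to model various systems, including a complex humanoid robot. The LLR model was added to the learning algorithm to generate more state-transitions for the agent to learn from. We show that an increasing number of experiences leads to faster learning. We introduce two algorithms, Prioritized Sweeping (PS) and Look Ahead (LA) Dyna, that use the possibilities of the model more efficiently. We show how prediction intervals can be used to increase the performance of these algorithms. The learning algorithms were compared using an inverted pendulum simulation, which had to learn a swing-up control task.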
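
The memory-based modeling idea in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the class name `LLRModel`, the neighbor count `k`, the small `ridge` term, and the Euclidean distance metric are all assumptions made for illustration. The returned residual spread is only a crude stand-in for the prediction intervals mentioned above, whose exact definition the abstract does not give.

```python
import numpy as np


class LLRModel:
    """Hypothetical memory-based Local Linear Regression transition model."""

    def __init__(self, k=10, ridge=1e-6):
        self.k = k            # number of nearest neighbors used per query (assumed)
        self.ridge = ridge    # small regularizer to keep the fit well-posed (assumed)
        self.inputs = []      # stored (state, action) vectors
        self.targets = []     # stored next-state vectors

    @staticmethod
    def _join(state, action):
        # Concatenate state and action into one flat input vector.
        return np.concatenate(
            [np.atleast_1d(state), np.atleast_1d(action)]
        ).astype(float)

    def add(self, state, action, next_state):
        """Store one observed state-transition in memory."""
        self.inputs.append(self._join(state, action))
        self.targets.append(np.atleast_1d(next_state).astype(float))

    def predict(self, state, action):
        """Fit an affine model on the k nearest samples and evaluate it."""
        q = self._join(state, action)
        X = np.asarray(self.inputs)
        Y = np.asarray(self.targets)
        k = min(self.k, len(X))

        # Select the k stored samples closest to the query point.
        dist = np.linalg.norm(X - q, axis=1)
        idx = np.argsort(dist)[:k]

        # Affine design matrix: neighbor inputs plus a bias column.
        A = np.hstack([X[idx], np.ones((k, 1))])

        # Ridge-regularized least squares for the local coefficients.
        gram = A.T @ A + self.ridge * np.eye(A.shape[1])
        beta = np.linalg.solve(gram, A.T @ Y[idx])

        prediction = np.append(q, 1.0) @ beta
        # Residual spread of the local fit: a rough uncertainty proxy only.
        spread = (Y[idx] - A @ beta).std(axis=0)
        return prediction, spread
```

In a Dyna-style loop, each real transition would be passed to `add`, and `predict` would then generate additional simulated transitions for the agent to learn from. A large `spread` could be used to skip simulated updates in regions where the local fit is poor.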
