Successor Features for Transfer in Reinforcement Learning

Authors: Tom Schaul, Rémi Munos, Jonathan J. Hunt, David Silver, Will Dabney

Abstract: Transfer in reinforcement learning refers to the notion that generalization should occur not only within a task but also across tasks. We propose a transfer framework for the scenario where the reward function changes between tasks but the environment's dynamics remain the same. Our approach rests on two key ideas: "successor features," a value function representation that decouples the dynamics of the environment from the rewards, and "generalized policy improvement," a generalization of dynamic programming's policy improvement operation that considers a set of policies rather than a single one. Put together, the two ideas lead to an approach that integrates seamlessly within the reinforcement learning framework and allows the free exchange of information across tasks. The proposed method also provides performance guarantees for the transferred policy even before any learning has taken place. We derive two theorems that set our approach in firm theoretical ground and present experiments that show that it successfully promotes transfer in practice, significantly outperforming alternative methods in a sequence of navigation tasks and in the control of a simulated robotic arm.
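To make the two ideas concrete, here is a minimal sketch of the standard formulation, paraphrasing the usual definitions rather than quoting the paper (notation: \phi for reward features, w for task weights, \psi for successor features):

Reward decomposition:   r(s, a, s') = \phi(s, a, s')^\top \mathbf{w}
Successor features:     \psi^{\pi}(s, a) = \mathbb{E}^{\pi}\!\left[\sum_{i=0}^{\infty} \gamma^{i}\, \phi(S_{t+i}, A_{t+i}, S_{t+i+1}) \,\middle|\, S_t = s,\, A_t = a\right]
Task value function:    Q^{\pi}(s, a) = \psi^{\pi}(s, a)^\top \mathbf{w}
Generalized policy improvement (GPI):   \pi'(s) \in \arg\max_{a} \max_{i} Q^{\pi_i}(s, a)

Because \psi^{\pi} depends only on the environment's dynamics and the policy, adapting to a new task amounts to estimating that task's \mathbf{w}; applying GPI over a library of stored policies \{\pi_i\} then produces a policy that performs at least as well as each \pi_i on the new task, which is the source of the pre-learning performance guarantee mentioned in the abstract.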
