Authors: Tom Schaul, Rémi Munos, Jonathan J. Hunt, David Silver, Will Dabney
DOI:
Keywords:
Abstract: Transfer in reinforcement learning refers to the notion that generalization should occur not only within a task but also across tasks. We propose a transfer framework for the scenario where the reward function changes between tasks but the environment's dynamics remain the same. Our approach rests on two key ideas: "successor features", a value function representation that decouples the dynamics of the environment from the rewards, and "generalized policy improvement", a generalization of dynamic programming's policy improvement operation that considers a set of policies rather than a single one. Put together, the two ideas lead to an approach that integrates seamlessly within the reinforcement learning framework and allows the free exchange of information across tasks. The proposed method also provides performance guarantees for the transferred policy even before any learning has taken place. We derive two theorems that set our approach in firm theoretical ground and present experiments that show that it successfully promotes transfer in practice, significantly outperforming alternative methods in a sequence of navigation tasks and in the control of a simulated robotic arm.
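A minimal sketch of the two ideas, using notation standard in the successor-features literature (the symbols \phi, \mathbf{w}, and \psi are assumptions here, not defined in this abstract): if one-step rewards factor as

    r(s, a, s') = \phi(s, a, s')^\top \mathbf{w},

then the successor features of a policy \pi,

    \psi^\pi(s, a) = \mathbb{E}^\pi\!\left[ \sum_{i=0}^{\infty} \gamma^i \, \phi(s_{t+i}, a_{t+i}, s_{t+i+1}) \,\middle|\, s_t = s,\ a_t = a \right],

give the action values as Q^\pi(s, a) = \psi^\pi(s, a)^\top \mathbf{w}: \psi^\pi depends only on the environment's dynamics under \pi, while \mathbf{w} encodes the task's rewards. Generalized policy improvement then acts greedily over a set of policies \{\pi_1, \dots, \pi_n\},

    \pi(s) \in \arg\max_{a} \max_{i} Q^{\pi_i}(s, a),

yielding a policy at least as good as each \pi_i, which is the source of the pre-learning performance guarantee mentioned above.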