Learning Shared Representations in Multi-task Reinforcement Learning

Authors: Thore Graepel, John Shawe-Taylor, Diana Borsa

DOI:

Keywords:

Abstract: We investigate a paradigm in multi-task reinforcement learning (MT-RL) in which an agent is placed in an environment and needs to learn to perform a series of tasks within this space. Since the environment does not change, there is potentially a lot of common ground amongst tasks, and learning to solve them individually seems extremely wasteful. In this paper, we explicitly model and learn this shared structure as it arises in the state-action value space. We show how one can jointly learn optimal value-functions by modifying the popular Value-Iteration and Policy-Iteration procedures to accommodate this shared representation assumption and leverage the power of multi-task supervised learning. Finally, we demonstrate that the proposed model and training procedures are able to infer good value functions, even in low-sample regimes. In addition to data efficiency, we show in our analysis that learning abstractions of the state space jointly across tasks leads to more robust, transferable representations with the potential for better generalization.
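To make the idea of a shared structure in the state-action value space concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a small tabular setting in which all tasks share one transition model `P` and differ only in their rewards `R`, and it swaps in a truncated SVD as a stand-in for the multi-task supervised learning step: after each Bellman backup, the stacked per-task Q-values are projected onto a rank-`rank` subspace common to all tasks. The function name, the parameters, and the low-rank assumption are all illustrative choices.

```python
import numpy as np

def multitask_value_iteration(P, R, gamma=0.9, rank=2, iters=200):
    """Value iteration over several tasks with a shared low-rank Q representation.

    P: (S, A, S) transition probabilities shared by all tasks (assumption).
    R: (T, S, A) per-task rewards.
    Returns Q with shape (T, S, A); its (S*A, T) matrix has rank <= `rank`.
    """
    T, S, A = R.shape
    Q = np.zeros((T, S, A))
    for _ in range(iters):
        V = Q.max(axis=2)  # greedy state values per task, shape (T, S)
        # Bellman backup: R[t,s,a] + gamma * sum_p P[s,a,p] * V[t,p]
        target = R + gamma * np.einsum("sap,tp->tsa", P, V)
        # Shared-representation step (illustrative): keep only the top
        # `rank` directions common to all tasks via a truncated SVD.
        M = target.reshape(T, S * A).T  # (S*A, T): one column per task
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        k = min(rank, len(s))
        M_low = (U[:, :k] * s[:k]) @ Vt[:k]
        Q = M_low.T.reshape(T, S, A)
    return Q

# Usage on a small random problem (hypothetical sizes).
rng = np.random.default_rng(0)
S, A, T = 5, 2, 4
P = rng.random((S, A, S))
P /= P.sum(axis=2, keepdims=True)  # normalize rows into distributions
R = rng.random((T, S, A))
Q = multitask_value_iteration(P, R, rank=2)
greedy_policies = Q.argmax(axis=2)  # one greedy policy per task, shape (T, S)
```

With `rank` set to the number of tasks the projection leaves the backup unchanged and the loop reduces to ordinary per-task value iteration. Smaller ranks force the tasks to share features, and this projected iteration is not guaranteed to converge in general, which is why the sketch runs for a fixed number of sweeps.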

References (17)
Tom Schaul, Daniel Horgan, David Silver, Karol Gregor. Universal Value Function Approximators. International Conference on Machine Learning, pp. 1312-1320 (2015).
George Konidaris, Andrew Barto. Building portable options: skill transfer in reinforcement learning. International Joint Conference on Artificial Intelligence, pp. 895-900 (2007).
Martin Stolle, Doina Precup. Learning Options in Reinforcement Learning. Symposium on Abstraction, Reformulation and Approximation, pp. 212-223 (2002). DOI: 10.1007/3-540-45622-8_16
Andrew G. Barto, Sridhar Mahadevan. Recent Advances in Hierarchical Reinforcement Learning. Discrete Event Dynamic Systems, vol. 13, pp. 41-77 (2003). DOI: 10.1023/A:1022140919877
Bernhard Hengst. Discovering Hierarchy in Reinforcement Learning with HEXQ. International Conference on Machine Learning, pp. 243-250 (2002).
Andreas Argyriou, Theodoros Evgeniou, Massimiliano Pontil. Convex multi-task feature learning. Machine Learning, vol. 73, pp. 243-272 (2008). DOI: 10.1007/S10994-007-5040-8
Peter Stone, Matthew E. Taylor. Transfer Learning for Reinforcement Learning Domains: A Survey. Journal of Machine Learning Research, vol. 10, pp. 1633-1685 (2009). DOI: 10.5555/1577069.1755839
Ilya Scheidwasser, George Konidaris, Andrew G. Barto. Transfer in reinforcement learning via shared features. Journal of Machine Learning Research, vol. 13, pp. 1333-1371 (2012). DOI: 10.5555/2188385.2343689
Richard S. Sutton, Doina Precup, Satinder Singh. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, vol. 112, pp. 181-211 (1999). DOI: 10.1016/S0004-3702(99)00052-1