Reinforcement Learning Transfer Based on Subgoal Discovery and Subtask Similarity
Hao Wang, Shunguo Fan, Jinhua Song, Yang Gao, Xingguo Chen

Authors: Hao Wang, Yang Gao, Xingguo Chen

DOI:

Keywords: Transfer of learning, Reinforcement learning, Learning classifier system, Semi-supervised learning, Machine learning, Multi-task learning, Temporal difference learning, Unsupervised learning, Artificial intelligence, Instance-based learning, Engineering

Abstract: This paper studies the problem of transfer learning in the context of reinforcement learning. We propose a novel method that can speed up learning with the aid of previously learnt tasks. Before performing extensive learning episodes, our method attempts to analyze the task via some exploration of the environment, and then reuses previous experience whenever it is possible and appropriate. In particular, the proposed method consists of four stages: 1) subgoal discovery, 2) option construction, 3) similarity searching, and 4) option reusing. To identify similar options, we measure the similarity between options, building upon the intuition that similar options have similar state-action probabilities. We examine our algorithm using experiments, comparing it with existing methods. The results show that our method outperforms conventional non-transfer reinforcement learning algorithms, as well as existing transfer methods, by a wide margin.
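The abstract does not give the paper's exact similarity formula, but the stated intuition (similar options induce similar state-action probabilities) can be illustrated with a minimal sketch. Here we assume, hypothetically, that each option is summarized as a state-action visitation distribution and compared with cosine similarity; the representation, the toy distributions, and the function name `option_similarity` are illustrative assumptions, not the authors' method.

```python
import numpy as np

def option_similarity(p, q):
    """Cosine similarity between two options' state-action
    visitation distributions (hypothetical representation:
    each option is a |S| x |A| array of probabilities)."""
    p = np.asarray(p, dtype=float).ravel()
    q = np.asarray(q, dtype=float).ravel()
    return float(np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q)))

# Two options over a toy 3-state, 2-action space; nearly
# identical visitation profiles should score close to 1.
opt_a = [[0.40, 0.10], [0.30, 0.10], [0.05, 0.05]]
opt_b = [[0.35, 0.15], [0.30, 0.10], [0.05, 0.05]]
sim = option_similarity(opt_a, opt_b)
```

Under this sketch, "similarity searching" would amount to computing such a score between each newly constructed option and the options stored from previous tasks, and reusing those above a threshold.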

References (18)
Andrew G. Barto, Alicia Peregrin Wolfe. Decision tree methods for finding reusable MDP homomorphisms. National Conference on Artificial Intelligence, pp. 530-535, 2006.
Satinder Singh, Vishal Soni. Using homomorphisms to transfer options across continuous reinforcement learning domains. National Conference on Artificial Intelligence, pp. 494-499, 2006.
Martin Stolle, Doina Precup. Learning options in reinforcement learning. Symposium on Abstraction, Reformulation and Approximation, pp. 212-223, 2002. DOI: 10.1007/3-540-45622-8_16.
Balaraman Ravindran, Andrew G. Barto. An algebraic approach to abstraction in reinforcement learning. University of Massachusetts Amherst, 2004.
Alessandro Lazaric, Marcello Restelli, Andrea Bonarini. Transfer of samples in batch reinforcement learning. Proceedings of the 25th International Conference on Machine Learning (ICML '08), pp. 544-551, 2008. DOI: 10.1145/1390156.1390225.
Özgür Şimşek, Alicia P. Wolfe, Andrew G. Barto. Identifying useful subgoals in reinforcement learning by local graph partitioning. Proceedings of the 22nd International Conference on Machine Learning (ICML '05), pp. 816-823, 2005. DOI: 10.1145/1102351.1102454.
Fei Chen, Yang Gao, Shifu Chen, Zhenduo Ma. Connect-based subgoal discovery for options in hierarchical reinforcement learning. International Conference on Natural Computation, vol. 4, pp. 698-702, 2007. DOI: 10.1109/ICNC.2007.312.
Richard S. Sutton, Doina Precup, Satinder Singh. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, vol. 112, pp. 181-211, 1999. DOI: 10.1016/S0004-3702(99)00052-1.
Manfred Huber, Sandeep Goel. Subgoal discovery for hierarchical reinforcement learning using learned policies. Florida AI Research Society (FLAIRS), pp. 346-350, 2003.
Matthew E. Taylor, Peter Stone. An introduction to intertask transfer for reinforcement learning. AI Magazine, vol. 32, pp. 15-34, 2011. DOI: 10.1609/AIMAG.V32I1.2329.