How to Dynamically Merge Markov Decision Processes

Authors: Satinder P. Singh, David Cohn

DOI:

Keywords: Dynamic programming; Computer science; Mathematical optimization; Markov decision process; Partially observable Markov decision process; Merge (version control)

Abstract: We are frequently called upon to perform multiple tasks that compete for our attention and resources. Often we know the optimal solution to each task in isolation; in this paper, we describe how this knowledge can be exploited to efficiently find good solutions for doing the tasks in parallel. We formulate this problem as that of dynamically merging multiple Markov decision processes (MDPs) into a composite MDP, and present a new theoretically-sound dynamic programming algorithm for finding an optimal policy for the composite MDP. We analyze various aspects of our algorithm and illustrate its use on a simple merging problem.
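The abstract's core construction can be illustrated with a small sketch: build a composite MDP whose states are tuples of component-MDP states, and solve it with value iteration. The toy dynamics, the rule that each composite action advances exactly one component, and the summed rewards are all assumptions for illustration; the paper's precise merging formulation and its specialized algorithm may differ.

```python
import itertools

# Two hypothetical component MDPs, each a deterministic chain 0..N-1.
# These dynamics are illustrative, not taken from the paper.
N = 3
GAMMA = 0.9

def step(s):
    """Advance a component one state toward its goal state N-1."""
    return min(s + 1, N - 1)

def reward(s):
    """A component pays reward 1 while it sits at its goal state."""
    return 1.0 if s == N - 1 else 0.0

# Composite MDP: states are pairs of component states; an action
# chooses which component gets to act this step (assumed merging rule).
states = list(itertools.product(range(N), range(N)))
actions = [0, 1]  # index of the component that advances

def composite_step(s, a):
    nxt = list(s)
    nxt[a] = step(s[a])
    return tuple(nxt)

def value_iteration(tol=1e-6):
    """Standard value iteration on the composite MDP."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            r = reward(s[0]) + reward(s[1])  # summed component rewards
            best = max(r + GAMMA * V[composite_step(s, a)] for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V

V = value_iteration()
```

At the joint goal state `(2, 2)` the value converges to 2/(1 - γ) = 20; the point of the composite formulation is that one policy trades off which task to attend to at each step, rather than solving each MDP in isolation.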

References (7)
John N. Tsitsiklis, Dimitri P. Bertsekas, Neuro-Dynamic Programming, (1996)
Leslie Pack Kaelbling, Nils J. Nilsson, Learning in Embedded Systems, (1993)
Andrew G. Barto, Steven J. Bradtke, Satinder P. Singh, Learning to act using real-time dynamic programming, Artificial Intelligence, vol. 72, pp. 81-138, (1995), 10.1016/0004-3702(94)00011-O
Andrew W. Moore, Christopher G. Atkeson, Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time, Machine Learning, vol. 13, pp. 103-130, (1993), 10.1023/A:1022635613229
Dimitri P. Bertsekas, Dynamic Programming and Optimal Control, Athena Scientific, (1995)
Thomas G. Dietterich, Wei Zhang, High-Performance Job-Shop Scheduling With A Time-Delay TD(λ) Network, Neural Information Processing Systems, vol. 8, pp. 1024-1030, (1995)
C. J. C. H. Watkins, Learning from delayed rewards, Ph.D. thesis, Cambridge University Psychology Department, (1989)