Sequential Decision Problems and Neural Networks

Authors: A. G. Barto, R. S. Sutton, C. J. C. H. Watkins

Abstract: Decision making tasks that involve delayed consequences are very common yet difficult to address with supervised learning methods. If there is an accurate model of the underlying dynamical system, then these tasks can be formulated as sequential decision problems and solved by Dynamic Programming. This paper discusses reinforcement learning in terms of the sequential decision framework and shows how a learning algorithm similar to the one implemented by the Adaptive Critic Element used in the pole-balancer of Barto, Sutton, and Anderson (1983), and further developed by Sutton (1984), fits into this framework. Adaptive neural networks can play significant roles as modules for approximating the functions required for solving sequential decision problems.

References (14)

Charles William Anderson, Learning and problem-solving with multilayer connectionist systems (adaptive, strategy learning, neural networks, reinforcement learning). University of Massachusetts Amherst (1986).

Richard Stuart Sutton, Temporal credit assignment in reinforcement learning. University of Massachusetts Amherst (1984).

Michael R. Hilliard, Gunar E. Liepins, Gita Rangarajan, Mark Palmer, Alternatives for classifier system credit assignment. International Joint Conference on Artificial Intelligence, pp. 756-761 (1989).

Charles W. Anderson, Learning and Problem Solving with Multilayer Connectionist Systems. University Microfilms International (1986).

Paul Werbos, Building and Understanding Adaptive Systems: A Statistical/Numerical Approach to Factory Automation and Brain Research. IEEE Transactions on Systems, Man, and Cybernetics, vol. 17, pp. 7-20 (1987). DOI: 10.1109/TSMC.1987.289329

D. P. Bertsekas, Chelsea C. White, Dynamic Programming and Stochastic Control. IEEE Transactions on Systems, Man, and Cybernetics, vol. 7, pp. 758-759 (1977). DOI: 10.1109/TSMC.1977.4309612

Sheldon M. Ross, Introduction to Stochastic Dynamic Programming (2014).

Ian H. Witten, An adaptive optimal controller for discrete-time Markov environments. Information and Control, vol. 34, pp. 286-295 (1977). DOI: 10.1016/S0019-9958(77)90354-0

Andrew G. Barto, Richard S. Sutton, Charles W. Anderson, Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man, and Cybernetics, vol. 13, pp. 834-846 (1983). DOI: 10.1109/TSMC.1983.6313077

Richard S. Sutton, Learning to Predict by the Methods of Temporal Differences. Machine Learning, vol. 3, pp. 9-44 (1988). DOI: 10.1023/A:1022633531479
