Authors: D. Precup, R. Sutton
DOI:
Keywords: Reinforcement learning, Series (mathematics), Algorithm, Backpropagation, Computational complexity theory, Function (mathematics), Mathematics, Gradient descent, Sensitivity (control systems), Stochastic gradient descent, Artificial intelligence
Abstract: This report describes a series of results using the exponentiated gradient descent (EG) method recently proposed by Kivinen and Warmuth. Prior work is extended by comparing speed of learning on a nonstationary problem and by an extension to backpropagation networks. Most significantly, we present an extension of the EG method to temporal-difference reinforcement learning. It is compared to conventional methods on two test problems using CMAC function approximators and replacing traces. On the larger problem, the average loss was approximately 25% smaller for the EG method. The relative computational complexity and parameter sensitivity of the methods are also discussed.
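The core contrast summarized in the abstract is between the usual additive gradient-descent TD update and a multiplicative, EG-style update. The sketch below is only an illustration of that contrast under assumed details (an unnormalized EG variant, positive initial weights, replacing traces); the paper's exact update rule, normalization, and handling of signed weights may differ, and the helper names are hypothetical.

```python
import numpy as np

def td_update_gd(w, delta, e, eta):
    """Conventional gradient-descent TD(lambda) update: additive step."""
    return w + eta * delta * e

def td_update_eg(w, delta, e, eta):
    """EG-style TD(lambda) update: multiplicative (exponentiated) step.

    Hypothetical unnormalized variant; Kivinen and Warmuth's EG family
    also includes normalized and EG+/- forms for signed weights.
    """
    return w * np.exp(eta * delta * e)

# Toy usage with a linear value function v(s) = w . x(s) and replacing traces.
rng = np.random.default_rng(0)
n_features, eta, gamma, lam = 8, 0.1, 0.95, 0.8
w = np.ones(n_features) / n_features        # EG typically starts from positive weights
e = np.zeros(n_features)

x, x_next = rng.random(n_features), rng.random(n_features)
reward = 1.0
e = np.maximum(gamma * lam * e, x)           # replacing traces
delta = reward + gamma * w @ x_next - w @ x  # TD error
w = td_update_eg(w, delta, e, eta)
```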