A menu of designs for reinforcement learning over time

Authors: W Thomas Miller, Richard S Sutton, Paul J Werbos

Keywords: Error-driven learning, Reinforcement learning, Programming language, Code (cryptography), Simple (abstract algebra), Action (philosophy), Computer science, Dynamic programming

Abstract: This chapter contains sections titled: Introduction and Overview, A Simple Two-Component Adaptive Critic Design, HDP and Dynamic Programming, Alternative Ways to Figure 3.2 in …
