Reinforcement learning for fuzzy agents: application to a pighouse environment control

Author: Lionel Jouffe

DOI: 10.1007/978-3-7908-1803-1_7

Abstract: Fuzzy Actor-Critic Learning (FACL) and Fuzzy Q-Learning (FQL) are reinforcement learning methods based on Dynamic Programming (DP) principles. In this chapter, they are used to tune online the conclusion part of Fuzzy Inference Systems (FIS). The only information available for learning is the system feedback, which describes, in terms of reward and punishment, the task the fuzzy agent has to realize. At each time step, the agent receives a reinforcement signal. The problem involves optimizing not only the direct reinforcement, but also the total amount of reinforcement the agent can receive in the future. To illustrate the use of these two methods, we first applied them to find a fuzzy controller that drives a boat from one bank to another, across a river with a strong non-linear current. Then, the well-known Cart-Pole Balancing and Mountain-Car problems enabled us to compare our methods with others and to focus on important characteristic aspects of FACL and FQL. Experimental studies have shown their superiority with respect to related work in the literature. We found that these generic methods allow us to learn every kind of problem (continuous states, discrete or continuous actions, various types of reinforcement functions). Thanks to this flexibility, they have been successfully applied to an industrial problem: discovering a control policy for pighouse environment control.
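The core idea behind FQL can be illustrated with a minimal sketch: a fuzzy inference system whose rule conclusions are q-values, updated by a temporal-difference error weighted by each rule's firing strength. This is a hedged illustration, not the author's exact algorithm; the membership functions, action set, and learning constants below are all assumptions chosen for clarity.

```python
import random

# Hypothetical FQL sketch (not the chapter's exact method): a 1-D state is
# fuzzified by triangular membership functions; each rule keeps one q-value
# per discrete action, and the TD error is shared among the rules in
# proportion to their normalized firing strengths.

GAMMA = 0.9                  # discount factor (assumed)
ALPHA = 0.1                  # learning rate (assumed)
CENTERS = [0.0, 0.5, 1.0]    # rule centers over the state space (assumed)
ACTIONS = [-1.0, 0.0, 1.0]   # discrete action set (assumed)

def memberships(x):
    """Normalized triangular firing strengths of state x for each rule."""
    degs = [max(0.0, 1.0 - abs(x - c) / 0.5) for c in CENTERS]
    total = sum(degs) or 1.0
    return [d / total for d in degs]

def choose_action(q, phi, eps=0.1):
    """Epsilon-greedy selection over firing-strength-weighted q-values."""
    if random.random() < eps:
        return random.randrange(len(ACTIONS))
    scores = [sum(phi[r] * q[r][a] for r in range(len(CENTERS)))
              for a in range(len(ACTIONS))]
    return scores.index(max(scores))

def fql_update(q, phi, a, reward, phi_next):
    """One FQL step: distribute the TD error to rules by firing strength."""
    q_sa = sum(phi[r] * q[r][a] for r in range(len(CENTERS)))
    v_next = max(sum(phi_next[r] * q[r][b] for r in range(len(CENTERS)))
                 for b in range(len(ACTIONS)))
    td = reward + GAMMA * v_next - q_sa   # optimize future reinforcement too
    for r in range(len(CENTERS)):
        q[r][a] += ALPHA * td * phi[r]
    return td
```

Because the q-values live in the rule conclusions, the same update works for continuous states; this is the flexibility the abstract attributes to FACL and FQL.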
