Absorbing and Ergodic Discretized Two-Action Learning Automata

Author: B. John Oommen

DOI: 10.1109/TSMC.1986.4308951

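Abstract: A learning automaton is a machine that interacts with a random environment and simultaneously learns the optimal action that the environment offers to it. Learning automata with variable structure are considered. Such automata are completely defined by a set of action probability updating rules. Contrary to all the variable-structure stochastic automata (VSSA) discussed in the literature, which update the probabilities in such a way that an action probability can take any real value in the interval [0,1], the probability space is discretized so as to permit the action probability to assume one of a finite number of distinct values in [0,1]. The discretized automata are termed linear or nonlinear depending on whether the subintervals of [0,1] are of equal length. It is proven that 1) discretized two-action linear reward-inaction automata are absorbing and ε-optimal in all environments; 2) discretized two-action linear inaction-penalty automata are ergodic and expedient in all environments; 3) discretized two-action linear inaction-penalty automata with artificially created absorbing barriers are ε-optimal in all environments; and 4) there exist discretized two-action automata that are ergodic and ε-optimal in all environments. The maximum advantage gained by rendering any finite-state machine stochastic has also been derived.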
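To make the discretization idea concrete, here is a minimal simulation sketch of a two-action discretized linear reward-inaction automaton. It is not taken from the paper: it assumes the commonly described formulation in which the probability of the first action moves one step of size 1/N toward the rewarded action and stays put on a penalty. The function name `simulate_dlri` and the parameters `c1`, `c2`, and `resolution` are hypothetical names chosen for illustration.

```python
import random


def simulate_dlri(c1, c2, resolution, rng, max_steps=100_000):
    """Simulate a two-action discretized linear reward-inaction automaton.

    c1, c2     : penalty probabilities of actions 1 and 2 (illustrative names).
    resolution : number N of equal-length subintervals of [0, 1].
    rng        : a random.Random instance, passed in for reproducibility.
    Returns the final probability of choosing action 1.
    """
    # The action probability is restricted to {0, 1/N, ..., 1}; we track
    # the integer index k and recover p1 = k / N when needed.
    k = resolution // 2
    for _ in range(max_steps):
        if k == 0 or k == resolution:
            break  # absorbing state: one action is now chosen with certainty
        p1 = k / resolution
        chose_first = rng.random() < p1
        penalty_prob = c1 if chose_first else c2
        rewarded = rng.random() >= penalty_prob
        if rewarded:
            # Reward: move one discrete step toward the action just taken.
            k += 1 if chose_first else -1
        # Penalty: inaction, the probability is left unchanged.
    return k / resolution


if __name__ == "__main__":
    final_p1 = simulate_dlri(c1=0.1, c2=0.9, resolution=50, rng=random.Random(0))
    print(final_p1)
```

Because the probability only moves in steps of 1/N, the process is a finite random walk whose end points 0 and 1 are absorbing, which is the sense of "absorbing" in result 1). Raising `resolution` makes the steps smaller, which under this assumed scheme makes absorption at the inferior action less likely at the cost of slower convergence.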

References (8)
Martin E. Hellman, Thomas M. Cover, "Learning with Finite Memory," Annals of Mathematical Statistics, vol. 41, pp. 765-782 (1970), 10.1214/AOMS/1177696958
B. J. Oommen, Eldon Hansen, "The asymptotic optimality of discretized linear reward-inaction learning automata," IEEE Transactions on Systems, Man, and Cybernetics, vol. 14, pp. 542-545 (1984), 10.1109/TSMC.1984.6313256
S. Lakshmivarahan, "A learning approach to the two person decentralized team problem with incomplete information," Applied Mathematics and Computation, vol. 8, pp. 79-82 (1981), 10.1016/0096-3003(81)90035-7
M. A. L. Thathachar, B. John Oommen, "Learning automata possessing ergodicity of the mean: The two-action case," IEEE Transactions on Systems, Man, and Cybernetics, vol. 13, pp. 1143-1148 (1983), 10.1109/TSMC.1983.6313191
Kumpati S. Narendra, M. A. L. Thathachar, "Learning Automata - A Survey," IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-4, pp. 323-334 (1974), 10.1109/TSMC.1974.5408453
Yu. A. Flerov, "Some Classes of Multi-Input Automata," Journal of Information Processing and Cybernetics, vol. 2, pp. 112-122 (1972), 10.1080/01969727208542916
Kumpati S. Narendra, M. A. L. Thathachar, "On the Behavior of a Learning Automaton in a Changing Environment with Application to Telephone Traffic Routing," IEEE Transactions on Systems, Man, and Cybernetics, vol. 10, pp. 262-269 (1980), 10.1109/TSMC.1980.4308485
Kumpati S. Narendra, E. Allen Wright, Lorne G. Mason, "Application of Learning Automata to Telephone Traffic Routing and Control," IEEE Transactions on Systems, Man, and Cybernetics, vol. 7, pp. 785-792 (1977), 10.1109/TSMC.1977.4309623