Distributional Reinforcement Learning in the Brain.

Authors: Adam S. Lowet, Qiao Zheng, Sara Matias, Jan Drugowitsch, Naoshige Uchida

DOI: 10.1016/J.TINS.2020.09.004

Abstract: Learning about rewards and punishments is critical for survival. Classical studies have demonstrated an impressive correspondence between the firing of dopamine neurons in the mammalian midbrain and the reward prediction errors of reinforcement learning algorithms, which express the difference between actual reward and predicted mean reward. However, it may be advantageous to learn not only the mean but also the complete distribution of potential rewards. Recent advances in machine learning have revealed a biologically plausible set of algorithms for reconstructing this reward distribution from experience. Here, we review the mathematical foundations of these algorithms as well as initial evidence for their neurobiological implementation. We conclude by highlighting outstanding questions regarding the circuit computation and behavioral readout of these distributional codes.
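
The contrast drawn in the abstract, between learning a single mean and learning a whole reward distribution, can be illustrated with a minimal sketch. It assumes one formulation often used in the distributional reinforcement learning literature: several value predictors share the same reward prediction error rule but scale positive and negative errors differently. The abstract does not specify this rule, so it may not match the algorithms the paper reviews. All names here (`learn_values`, `taus`, `base_rate`) are hypothetical, as is the toy reward set.

```python
import random


def learn_values(rewards, alphas_pos, alphas_neg, n_steps=20000, seed=0):
    """Train several value predictors on the same stream of sampled rewards.

    Each predictor i updates with its own learning rate for positive
    prediction errors (alphas_pos[i]) and negative ones (alphas_neg[i]).
    """
    rng = random.Random(seed)
    values = [0.0] * len(alphas_pos)
    for _ in range(n_steps):
        r = rng.choice(rewards)  # sample one reward outcome
        for i in range(len(values)):
            delta = r - values[i]  # reward prediction error: actual minus predicted
            alpha = alphas_pos[i] if delta > 0 else alphas_neg[i]
            values[i] += alpha * delta
    return values


# Toy reward distribution (an assumption for illustration):
# reward 10 with probability 0.25, reward 0 otherwise, so the mean is 2.5.
rewards = [0.0, 0.0, 0.0, 10.0]

# taus below 0.5 give pessimistic predictors, above 0.5 optimistic ones;
# tau = 0.5 has equal rates and recovers the classical mean estimate.
taus = [0.1, 0.5, 0.9]
base_rate = 0.01
alphas_pos = [base_rate * t for t in taus]
alphas_neg = [base_rate * (1.0 - t) for t in taus]

values = learn_values(rewards, alphas_pos, alphas_neg)
for tau, v in zip(taus, values):
    print(f"tau={tau:.1f}  learned value={v:.2f}")
```

With equal rates the predictor settles near the mean reward, while the asymmetric predictors settle below and above it, so the set of learned values taken together carries information about the spread of the reward distribution and not only its mean.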
