RE-EVALUATE: Reproducibility in Evaluating Reinforcement Learning Algorithms

Authors: Joelle Pineau, Riashat Islam, Andre Cianflone, Zafarali Ahmed, Khimya Khetarpal

DOI:

Keywords: Reproducibility, Computer science, Machine learning, Reinforcement learning, Artificial intelligence

Abstract: Reinforcement learning (RL) has recently achieved tremendous success in solving complex tasks. Careful considerations are made towards reproducible research in machine learning …


References (9)

Mitch Marcus, Beatrice Santorini, Mary Ann Marcinkiewicz. Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics, vol. 19, pp. 313-330 (1993). DOI: 10.21236/ADA273556

Y. Lecun, L. Bottou, Y. Bengio, P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, vol. 86, pp. 2278-2324 (1998). DOI: 10.1109/5.726791

Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, Li Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision, vol. 115, pp. 211-252 (2015). DOI: 10.1007/S11263-015-0816-Y

Ronald J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning. Machine Learning, vol. 8, pp. 229-256 (1992). DOI: 10.1007/BF00992696

Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis. Human-level control through deep reinforcement learning. Nature, vol. 518, pp. 529-533 (2015). DOI: 10.1038/NATURE14236

Martin Riedmiller, Jan Peters, Stefan Schaal. Evaluation of Policy Gradient Methods and Variants on the Cart-Pole Benchmark. 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, pp. 254-261 (2007). DOI: 10.1109/ADPRL.2007.368196

M. G. Bellemare, Y. Naddaf, J. Veness, M. Bowling. The arcade learning environment: an evaluation platform for general agents. Journal of Artificial Intelligence Research, vol. 47, pp. 253-279 (2013). DOI: 10.1613/JAIR.3912

David Silver, Aja Huang, Chris J Maddison, Arthur Guez, Laurent Sifre, George Van Den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, Demis Hassabis. Mastering the game of Go with deep neural networks and tree search. Nature, vol. 529, pp. 484-489 (2016). DOI: 10.1038/NATURE16961

Marwin H. S. Segler, Mike Preuss, Mark P. Waller. Planning chemical syntheses with deep neural networks and symbolic AI. Nature, vol. 555, pp. 604-610 (2018). DOI: 10.1038/NATURE25978
