RE-EVALUATE: Reproducibility in Evaluating Reinforcement Learning Algorithms

Authors: Joelle Pineau, Riashat Islam, Andre Cianflone, Zafarali Ahmed, Khimya Khetarpal

Keywords: Reproducibility, Computer science, Machine learning, Reinforcement learning, Artificial intelligence

Abstract: Reinforcement learning (RL) has recently achieved tremendous success in solving complex tasks. Careful considerations are made towards reproducible research in machine learning …

References (9)
Mitch Marcus, Beatrice Santorini, Mary Ann Marcinkiewicz. Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics, vol. 19, pp. 313-330 (1993). doi:10.21236/ADA273556
Y. Lecun, L. Bottou, Y. Bengio, P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, vol. 86, pp. 2278-2324 (1998). doi:10.1109/5.726791
Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, Li Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision, vol. 115, pp. 211-252 (2015). doi:10.1007/S11263-015-0816-Y
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis. Human-level control through deep reinforcement learning. Nature, vol. 518, pp. 529-533 (2015). doi:10.1038/NATURE14236
Martin Riedmiller, Jan Peters, Stefan Schaal. Evaluation of Policy Gradient Methods and Variants on the Cart-Pole Benchmark. 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, pp. 254-261 (2007). doi:10.1109/ADPRL.2007.368196
M. G. Bellemare, Y. Naddaf, J. Veness, M. Bowling. The arcade learning environment: an evaluation platform for general agents. Journal of Artificial Intelligence Research, vol. 47, pp. 253-279 (2013). doi:10.1613/JAIR.3912
David Silver, Aja Huang, Chris J Maddison, Arthur Guez, Laurent Sifre, George Van Den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, Demis Hassabis. Mastering the game of Go with deep neural networks and tree search. Nature, vol. 529, pp. 484-489 (2016). doi:10.1038/NATURE16961
Marwin H. S. Segler, Mike Preuss, Mark P. Waller. Planning chemical syntheses with deep neural networks and symbolic AI. Nature, vol. 555, pp. 604-610 (2018). doi:10.1038/NATURE25978