Control of Memory, Active Perception, and Action in Minecraft

Authors: Satinder Singh, Honglak Lee, Junhyuk Oh, Valliappa Chockalingam

Abstract: In this paper, we introduce a new set of reinforcement learning (RL) tasks in Minecraft (a flexible 3D world). We then use these tasks to systematically compare and contrast existing deep reinforcement learning (DRL) architectures with our new memory-based DRL architectures. These tasks are designed to emphasize, in a controllable manner, issues that pose challenges for RL methods, including partial observability (due to first-person visual observations), delayed rewards, high-dimensional visual observations, and the need to use active perception in a correct manner so as to perform well in the tasks. While these tasks are conceptually simple to describe, by virtue of having all of these challenges simultaneously they are difficult for current DRL architectures. Additionally, we evaluate the generalization performance of the architectures on environments not used during training. The experimental results show that our new architectures generalize to unseen environments better than existing DRL architectures.
