Visual Explanation using Attention Mechanism in Actor-Critic-based Deep Reinforcement Learning

Authors: Hironobu Fujiyoshi, Tsubasa Hirakawa, Komei Sugiura, Takayoshi Yamashita, Hidenori Itaya

DOI:

Keywords: Action (philosophy), Robot control, Artificial intelligence, Value (ethics), Reinforcement learning, State (computer science), Computer science, Feature (machine learning), Focus (computing)

Abstract: Deep reinforcement learning (DRL) has great potential for acquiring the optimal action in complex environments such as games and robot control. However, it is difficult to analyze …
