Deep Reinforcement Learning for Green Security Games with Real-Time Information

Authors: Lucas Joppa, Fei Fang, Lantao Yu, Yi Wu, Zheyuan Ryan Shi

Abstract: Green Security Games (GSGs) have been proposed and applied to optimize patrols conducted by law enforcement agencies in green security domains such as combating poaching, illegal logging, and overfishing. However, real-time information such as footprints, and agents' subsequent actions upon receiving the information, e.g., rangers following the footprints to chase the poacher, have been neglected in previous work. To fill the gap, we first propose a new game model, GSG-I, which augments GSGs with sequential movement and the vital element of real-time information. Second, we design a novel deep reinforcement learning-based algorithm, DeDOL, to compute a patrolling strategy that adapts to the real-time information against a best-responding attacker. DeDOL is built upon the double oracle framework and the policy-space response oracle, solving a restricted game and iteratively adding best response strategies to it through training deep Q-networks. Exploring the game structure, DeDOL uses domain-specific heuristic strategies as initial strategies and constructs several local modes for efficient and parallelized training. To our knowledge, this is the first attempt to use Deep Q-Learning for security games.
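The double oracle loop that the abstract refers to can be illustrated on a toy problem. The sketch below is not DeDOL. It assumes a small zero-sum matrix game (rock-paper-scissors) in place of GSG-I, approximates the restricted-game equilibrium with fictitious play, and finds best responses by enumerating pure strategies rather than by training deep Q-networks. The names `fictitious_play`, `double_oracle`, and `RPS` are hypothetical and chosen only for this illustration.

```python
def fictitious_play(A, iters=5000):
    """Approximate an equilibrium of the zero-sum matrix game A.

    A[r][c] is the row player's payoff; the row player maximizes and the
    column player minimizes. This stands in for an exact solver.
    """
    m, n = len(A), len(A[0])
    row_counts, col_counts = [0] * m, [0] * n
    i, j = 0, 0  # arbitrary initial pure strategies
    for _ in range(iters):
        row_counts[i] += 1
        col_counts[j] += 1
        # Each player best-responds to the opponent's empirical play so far.
        i = max(range(m), key=lambda r: sum(A[r][c] * col_counts[c] for c in range(n)))
        j = min(range(n), key=lambda c: sum(A[r][c] * row_counts[r] for r in range(m)))
    return [k / iters for k in row_counts], [k / iters for k in col_counts]


def double_oracle(A, iters=5000):
    """Grow restricted strategy sets until neither best response is new."""
    m, n = len(A), len(A[0])
    rows, cols = [0], [0]  # assumed initial strategies (DeDOL uses heuristics here)
    while True:
        # 1. Solve the restricted game over the current strategy sets.
        sub = [[A[r][c] for c in cols] for r in rows]
        x_sub, y_sub = fictitious_play(sub, iters)

        # 2. Lift the restricted mixed strategies back to the full game.
        x, y = [0.0] * m, [0.0] * n
        for p, r in zip(x_sub, rows):
            x[r] = p
        for q, c in zip(y_sub, cols):
            y[c] = q

        # 3. Best-response oracles (exact enumeration in this toy setting).
        br_row = max(range(m), key=lambda r: sum(A[r][c] * y[c] for c in range(n)))
        br_col = min(range(n), key=lambda c: sum(A[r][c] * x[r] for r in range(m)))

        # 4. Stop when both best responses are already in the restricted game.
        if br_row in rows and br_col in cols:
            return x, y, rows, cols
        if br_row not in rows:
            rows.append(br_row)
        if br_col not in cols:
            cols.append(br_col)


# Rock-paper-scissors payoffs for the row player.
RPS = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]

x, y, rows, cols = double_oracle(RPS)
print("row support:", rows, "column support:", cols)
```

On this game the loop starts with a single strategy per player and adds one best response per side in each round until all three strategies are present. Because fictitious play is only approximate, the returned mixed strategies are approximate as well. In a large sequential game, where enumerating strategies is infeasible, step 3 is where a learned best response would be needed instead.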
