Automated action abstraction of imperfect information extensive-form games

Authors: Robert Holte, John Hawkin, Duane Szafron

DOI:

Keywords: Perfect information, Decision problem, Limit (mathematics), Mathematical optimization, Action (philosophy), Extensive-form game, Focus (computing), Space (commercial competition), Computer science, Value (ethics), Artificial intelligence

Abstract: Multi-agent decision problems can often be formulated as extensive-form games. We focus on imperfect information extensive-form games in which one or more actions at many decision points have an associated continuous or many-valued parameter. A stock trading agent, in addition to deciding whether to buy or not, must decide how much to buy. In no-limit poker, in addition to selecting a probability for each action, the agent must decide how much to bet for each betting action. Selecting values for these parameters makes these games extremely large. Two-player no-limit Texas Hold'em poker with stacks of 500 big blinds has approximately 10^71 states, which is more than 10^50 times more states than two-player limit Texas Hold'em. The main contribution of this paper is a technique that abstracts a game's action space by selecting one, or a small number, of the parameter values for each action. We show that strategies computed using this new algorithm for no-limit Leduc poker exhibit significant utility gains over ε-Nash equilibrium strategies computed with standard, hand-crafted parameter value abstractions.
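
To make the idea of an action abstraction concrete, the following is a minimal sketch of the kind of hand-crafted bet-size abstraction the abstract uses as its baseline. It is not the paper's algorithm. The function names, the whole-chip bet granularity, and the chosen pot fractions (half pot, full pot, all-in) are illustrative assumptions, not taken from the source.

```python
def full_bet_sizes(stack, min_bet=1):
    """Every whole-chip bet from min_bet up to the remaining stack.

    Assumption: bets are made in whole chips (hypothetical granularity).
    """
    return list(range(min_bet, stack + 1))


def abstract_bet_sizes(pot, stack, pot_fractions=(0.5, 1.0), min_bet=1):
    """Keep only a few bet sizes: chosen fractions of the pot, plus all-in.

    The fractions are a hand-picked assumption; the paper's contribution
    is to select such parameter values automatically instead.
    """
    sizes = {stack}  # all-in is always kept
    for fraction in pot_fractions:
        bet = round(pot * fraction)
        if min_bet <= bet <= stack:  # drop bets that are not legal
            sizes.add(bet)
    return sorted(sizes)


full = full_bet_sizes(stack=500)
small = abstract_bet_sizes(pot=20, stack=500)
print(len(full), "bet sizes before abstraction;", len(small), "after:", small)
```

Under these assumptions, a single decision point shrinks from 500 possible bet sizes to 3. Because the reduction applies at every betting decision, the abstracted game tree is far smaller than the original.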

References (17)
Vincent Corruble, Charles Madeira, Geber Ramalho. Designing a reinforcement learning-based adaptive AI for large-scale strategy games. National Conference on Artificial Intelligence, pp. 121-123 (2006).
Kevin Waugh, Martin Zinkevich, Michael Johanson, Morgan Kan, David Schnizlein, Michael H. Bowling. A Practical Use of Imperfect Recall. Symposium on Abstraction, Reformulation and Approximation (2009).
Robert Givan, Thomas Dean, Kee-Eung Kim. Solving stochastic planning problems with large state and action spaces. International Conference on Artificial Intelligence Planning Systems, pp. 102-110 (1998).
Andrew Gilpin, Samid Hoda, Javier Peña, Tuomas Sandholm. Gradient-Based Algorithms for Finding Nash Equilibria in Extensive Form Games. Lecture Notes in Computer Science, pp. 57-69 (2007). DOI: 10.1007/978-3-540-77105-0_9
Tuomas Sandholm, Andrew Gilpin, Troels Bjerre Sørensen. Potential-aware automated abstraction of sequential games, and holistic equilibrium analysis of Texas Hold'em poker. National Conference on Artificial Intelligence, pp. 50-57 (2007).
S.H. Tijs. Stochastic games with one big action space in each state. Research Papers in Economics (1980).
Tuomas Sandholm, Andrew Gilpin, Troels Bjerre Sørensen. A heads-up no-limit Texas Hold'em poker player: discretized betting models and automatically generated equilibrium-finding programs. Adaptive Agents and Multi-Agents Systems, pp. 911-918 (2008). DOI: 10.5555/1402298.1402350
Andrew Gilpin, Tuomas Sandholm. Better automated abstraction techniques for imperfect information games, with application to Texas Hold'em poker. Adaptive Agents and Multi-Agents Systems, pp. 192- (2007). DOI: 10.1145/1329125.1329358
Nolan Bard, Kevin Waugh, Michael Bowling. Strategy Grafting in Extensive Games. Neural Information Processing Systems, vol. 22, pp. 2026-2034 (2009).
Hado van Hasselt, Marco A. Wiering. Reinforcement Learning in Continuous Action Spaces. 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, pp. 272-279 (2007). DOI: 10.1109/ADPRL.2007.368199