Optimistic Regret Minimization for Extensive-Form Games via Dilated Distance-Generating Functions

作者: Gabriele Farina , Christian Kroer , Tuomas Sandholm

DOI:

关键词:

摘要: We study the performance of optimistic regret-minimization algorithms for both minimizing regret in, and computing Nash equilibria of, zero-sum extensive-form games. In order to apply these games, a distance-generating function is needed. use dilated entropy Euclidean distance functions. For we prove first explicit bounds on strong-convexity parameter general treeplexes. Furthermore, show that functions enable us decompose mirror descent algorithm, its variant, into local at each information set. This decomposition mirrors structure counterfactual minimization framework, enables important techniques in practice, such as distributed updates pruning cold parts game tree. Our provably converge rate $T^{-1}$, which superior prior algorithms. experimentally compare popular algorithm CFR+, has theoretical convergence $T^{-0.5}$ theory, but known often or better, practice. give an example matrix where CFR+ converges relatively slow $T^{-0.74}$, whereas our methods faster than $T^{-1}$. go fast also holds Kuhn poker game, game. games with deeper trees however, find still faster. Finally when goal regret, rather equilibrium, can outperform even deep trees.

参考文章(39)
Alexander Rakhlin, Karthik Sridharan, Online Learning With Predictable Sequences conference on learning theory. ,vol. 30, pp. 993- 1019 ,(2013)
Samid Hoda, Andrew Gilpin, Javier Peña, Tuomas Sandholm, Smoothing Techniques for Computing Nash Equilibria of Sequential Games Mathematics of Operations Research. ,vol. 35, pp. 494- 512 ,(2010) , 10.1287/MOOR.1100.0452
M. Bowling, N. Burch, M. Johanson, O. Tammelin, Heads-up limit hold’em poker is solved Science. ,vol. 347, pp. 145- 149 ,(2015) , 10.1126/SCIENCE.1259433
Christian Kroer, Tuomas Sandholm, Extensive-form game abstraction with bounds economics and computation. pp. 621- 638 ,(2014) , 10.1145/2600057.2602905
Shai Shalev-Shwartz, Yoram Singer, A primal-dual perspective of online learning algorithms Machine Learning. ,vol. 69, pp. 115- 142 ,(2007) , 10.1007/S10994-007-5014-X
Andrew Gilpin, Tuomas Sandholm, Lossless abstraction of imperfect information games Journal of the ACM. ,vol. 54, pp. 25- ,(2007) , 10.1145/1284320.1284324
Antonin Chambolle, Thomas Pock, A First-Order Primal-Dual Algorithm for Convex Problems with Applications to Imaging Journal of Mathematical Imaging and Vision. ,vol. 40, pp. 120- 145 ,(2011) , 10.1007/S10851-010-0251-1
Carmelo Piccione, Martin Zinkevich, Michael Johanson, Michael Bowling, Regret Minimization in Games with Incomplete Information neural information processing systems. ,vol. 20, pp. 1729- 1736 ,(2007) , 10.7939/R3Q23R282
Neil Burch, Marc Lanctot, Michael Bowling, Richard G. Gibson, Martin Zinkevich, No-Regret Learning in Extensive-Form Games with Imperfect Recall international conference on machine learning. pp. 1035- 1042 ,(2012)