Authors: Gabriele Farina, Christian Kroer, Tuomas Sandholm
DOI:
Keywords:
Abstract: We study the performance of optimistic regret-minimization algorithms for both minimizing regret in, and computing Nash equilibria of, zero-sum extensive-form games. In order to apply these algorithms to extensive-form games, a distance-generating function is needed. We study the use of the dilated entropy and dilated Euclidean distance functions. For the dilated Euclidean distance function we prove the first explicit bounds on the strong-convexity parameter for general treeplexes. Furthermore, we show that dilated distance-generating functions enable us to decompose the mirror descent algorithm, and its optimistic variant, into local mirror descent algorithms at each information set. This decomposition mirrors the structure of the counterfactual regret minimization framework and enables important techniques in practice, such as distributed updates and pruning of cold parts of the game tree. Our algorithms provably converge at a rate of $T^{-1}$, which is superior to prior counterfactual regret minimization algorithms. We experimentally compare to the popular algorithm CFR+, which has a theoretical convergence rate of $T^{-0.5}$, but is known to often converge at a rate of $T^{-1}$, or better, in practice. We give an example matrix game where CFR+ experimentally converges at a relatively slow rate of $T^{-0.74}$, whereas our optimistic methods converge faster than $T^{-1}$. This fast convergence also holds in the Kuhn poker game, an extensive-form game. For games with deeper game trees, however, we find that CFR+ is still faster. Finally, we show that when the goal is minimizing regret, rather than computing a Nash equilibrium, our optimistic methods can outperform CFR+ even in deep game trees.
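As a concrete illustration (not code from the paper), the sketch below shows optimistic online mirror descent with the entropy distance-generating function on a single probability simplex, i.e., optimistic Hedge, run in self-play to approximate a Nash equilibrium of a zero-sum matrix game. The paper's algorithms operate on treeplexes via dilated distance-generating functions, with a local update of essentially this form at each information set; all names here (`step`, `self_play`, `A`, `eta`, `T`) and the step size are hypothetical choices for the sketch.

```python
import numpy as np

def step(z, m, eta):
    """One entropy-DGF mirror descent step on a simplex: move the point z
    against the loss vector m (multiplicative weights), then renormalize."""
    x = z * np.exp(-eta * m)
    return x / x.sum()

def self_play(A, T=1000, eta=0.5):
    """Two optimistic-Hedge players in self-play on the zero-sum matrix
    game A: the row player minimizes x^T A y, the column player maximizes
    it. Returns the average strategies, which approximate a Nash
    equilibrium. Hypothetical sketch, not the paper's treeplex algorithm."""
    nx, ny = A.shape
    zx, zy = np.full(nx, 1 / nx), np.full(ny, 1 / ny)  # mirror-descent centers
    mx, my = np.zeros(nx), np.zeros(ny)                # predictions of next loss
    sx, sy = np.zeros(nx), np.zeros(ny)                # running strategy sums
    for _ in range(T):
        x = step(zx, mx, eta)        # prediction step: play against m
        y = step(zy, my, eta)
        lx, ly = A @ y, -A.T @ x     # realized loss vectors for each player
        zx = step(zx, lx, eta)       # correction step: update the centers
        zy = step(zy, ly, eta)
        mx, my = lx, ly              # predict next loss = current loss
        sx += x
        sy += y
    return sx / T, sy / T
```

Using the last observed loss as the prediction, as above, is the standard setup under which optimistic methods achieve the fast $T^{-1}$ convergence to equilibrium cited in the abstract, versus the $T^{-0.5}$ worst-case rate of non-optimistic regret minimizers such as CFR+.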