Multigrid methods for zero-sum two player stochastic games

作者: Sylvie Detournay

DOI:

关键词: Multigrid methodApplied mathematicsStochastic gameBellman equationMathematical optimizationMathematicsTwo-player gameMarkov chainStochastic controlVariational inequalityDifferential game

摘要: In this thesis, we present some algorithms and numerical results for the solution of large scale zero-sum two player repeated stochastic games. particular, consider class games with perfect information infinite horizon. class, discounted payoff mean payoff. Our are mainly based on policy iteration type multigrid methods, implemented in C. These applied either to dynamic programming equation a true finite state space game or discretization an Isaacs PDE differential game. first part, propose algorithm which combines iterations algebraic methods solve linear systems involved iterations. We tests discretizations equations variational inequalities. also full multilevel iteration, similar FMG, allows one improve substantially computation time solving For payoff, develop action spaces, general multichain case (i.e. without irreducibility assumption Markov chains determined by strategies players), following idea Cochet-Terrasson Gaubert (2006). This relies notion nonlinear spectral projection operators (which monotone convex). show that sequence values relative satisfies lexicographical monotonicity property, implies does terminate. variant Richman pursuit-evasion new particular singular arise instance above Furthermore, introduce method find stationary probability irreducible chain using control approach Howard systems.

参考文章(160)
T. E. S. Raghavan, Finite-Step Algorithms for Single-Controller and Perfect Information Stochastic Games Stochastic Games and Applications. pp. 227- 251 ,(2003) , 10.1007/978-94-010-0189-2_15
Timothy A. Davis, Direct Methods for Sparse Linear Systems (Fundamentals of Algorithms 2) Society for Industrial and Applied Mathematics. ,(2006)
J. Van der Wal, Discounted Markov games: Generalized policy iteration method Journal of Optimization Theory and Applications. ,vol. 25, pp. 125- 138 ,(1978) , 10.1007/BF00933260
Emiliano Cristiani, Maurizio Falcone, Fully-Discrete Schemes for the Value Function of Pursuit-Evasion Games with State Constraints Advances in Dynamic Games and Their Applications. pp. 1- 30 ,(2009) , 10.1007/978-0-8176-4834-3_11
Alan B Williams, Kevin R Long, Michael Allen Heroux, Richard B Lehoucq, Andrew Gerhard Salinger, Tamara Gibson Kolda, Roger Patrick Pawlowski, Robert John Hoekstra, Roscoe Ainsworth Bartlett, Raymond Stephen Tuminaro, Victoria E Howle, Jonathan Joseph Hu, James M Willenbring, Eric Todd Phipps, Heidi K Thornquist, None, An overview of Trilinos. Sandia National Laboratories. ,(2003) , 10.2172/918383
Martino Bardi, Maurizio Falcone, Pierpaolo Soravia, Numerical Methods for Pursuit-Evasion Games via Viscosity Solutions Stochastic and Differential Games. pp. 105- 175 ,(1999) , 10.1007/978-1-4612-1592-9_3
M. Bardi, P. Soravia, M. Falcone, Fully Discrete Schemes for the Value Function of Pursuit-Evasion Games Advances in Dynamic Games and Applications. pp. 89- 105 ,(1994) , 10.1007/978-1-4612-0245-5_5
Marianne Akian, Analyse de l’algorithme multigrille FMGH de résolution d’équations d’Hamilton-Jacobi-Bellman Lecture Notes in Control and Information Sciences. ,vol. 144, pp. 113- 122 ,(1990) , 10.1007/BFB0120034