Adversarial Policy Gradient for Alternating Markov Games

Authors: Ryan Hayward, Chao Gao, Martin Mueller

DOI:

Keywords:

Abstract: Policy gradient reinforcement learning has been applied to two-player, alternate-turn, zero-sum games; e.g., in AlphaGo, self-play REINFORCE was used to improve the neural net …
