Show me the way! bilevel search for synthesizing programmatic strategies

作者: David S Aleixo , Levi HS Lelis

DOI:

关键词:

摘要: The synthesis of programmatic strategies requires one to search in large non-differentiable spaces of computer programs. Current search algorithms use self-play approaches to guide this search. The issue with these approaches is that the guiding function often provides a weak search signal. This is because self-play functions only measure how well a program performs against other programs. Thus, while small changes to a losing program might not transform it into a winning one, such changes might represent steps in the direction of a winning program. In this paper we introduce a bilevel search algorithm that searches concurrently in the space of programs and in a space of state features. Each iteration of the search in the space of features defines a set of target features that the search in the program space attempts to achieve (ie, features one observes while following the strategy encoded in a program). We hypothesize the combination of a self-play function and a feature-based one provides a stronger search signal for synthesis. While both functions are used to guide the search in the program space, the self-play function is used to guide the search in the feature space, to allow for the selection of target features that are more likely to lead to winning programs. We evaluated our bilevel algorithm in MicroRTS, a real-time strategy game. Our results show that the bilevel search synthesizes stronger strategies than methods that search only in the program space. Also, the strategies our method synthesizes obtained the highest winning rate in a simulated tournament with several baseline agents, including the best agents from the two latest …

参考文章(0)