Performance analysis of a 240 thread tournament level MCTS Go program on the Intel Xeon Phi

Authors: H. Jaap van den Herik, Jos Vermaseren, S. Ali Mirsoleimani, Aske Plaat

Abstract: In 2013 Intel introduced the Xeon Phi, a new parallel co-processor board. The Xeon Phi is a cache-coherent many-core shared memory architecture claiming CPU-like versatility, programmability, high performance, and power efficiency. The first published micro-benchmark studies indicate that many of Intel's claims appear to be true. The current paper is a study on the Xeon Phi of a complex artificial intelligence application. It contains an open source MCTS application for playing tournament quality Go (an oriental board game). We report speedup figures for up to 240 threads on a real machine, allowing a direct comparison to previous simulation studies. After a substantial amount of work, we observed that performance scales well up to 32 threads, largely confirming previous simulation results for this Go program, although performance surprisingly deteriorates between 32 and 240 threads. Furthermore, we report (1) unexpected performance anomalies between the Xeon Phi and the Xeon CPU for small problem sizes and small numbers of threads, and (2) that performance is sensitive to scheduling choices. Achieving good performance on the Xeon Phi for complex programs is not straightforward; it requires a deep understanding of (1) search patterns, (2) scheduling, and (3) the architecture and its many cores and caches. In practice, the Xeon Phi is less straightforward to program than originally envisioned by Intel.
