On combining decisions from multiple expert imitators for performance

Authors: Ian Watson, Jonathan Rubin

DOI: 10.5591/978-1-57735-516-8/IJCAI11-067

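Abstract: One approach for artificially intelligent agents wishing to maximise some performance metric in a given domain is to learn from a collection of training data that consists of actions or decisions made by some expert, in an attempt to imitate that expert's style. We refer to this type of agent as an expert imitator. In this paper we investigate whether performance can be improved by combining decisions from multiple expert imitators. In particular, we investigate two existing approaches for combining decisions. The first approach combines decisions by employing ensemble voting between multiple expert imitators. The second approach dynamically selects the best imitator to use at runtime, given the performance of the imitators in the current environment. We investigate these approaches in the domain of computer poker. In particular, we create expert imitators for limit and no limit Texas Hold'em and determine whether their performance can be improved by combining their decisions using the two approaches listed above.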
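
The two combination approaches named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The names `majority_vote` and `UCB1Selector` are hypothetical. The abstract does not say which selection rule is used, so the sketch assumes UCB1, a standard multi-armed bandit rule. It also assumes rewards are scaled to the range [0, 1], which raw poker winnings would need to be.

```python
import math
from collections import Counter


def majority_vote(decisions):
    """Ensemble voting: return the action chosen by the most imitators.

    Ties are broken in favour of the earliest imitator in the list
    (an assumption; the abstract does not specify tie-breaking).
    """
    counts = Counter(decisions)
    best = max(counts.values())
    for action in decisions:
        if counts[action] == best:
            return action


class UCB1Selector:
    """Dynamic selection: pick one imitator per hand using UCB1 (assumed rule)."""

    def __init__(self, n_imitators):
        self.counts = [0] * n_imitators    # times each imitator was used
        self.totals = [0.0] * n_imitators  # summed reward per imitator

    def select(self):
        # Use every imitator once before comparing scores.
        for index, count in enumerate(self.counts):
            if count == 0:
                return index
        total_plays = sum(self.counts)
        scores = [
            self.totals[i] / self.counts[i]
            + math.sqrt(2.0 * math.log(total_plays) / self.counts[i])
            for i in range(len(self.counts))
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, index, reward):
        # Reward is assumed to be scaled to [0, 1].
        self.counts[index] += 1
        self.totals[index] += reward
```

In use, the voting approach queries every imitator on each decision and plays `majority_vote(...)` of their answers. The selection approach calls `select()` before a hand, lets the chosen imitator play it, and reports the scaled outcome through `update()`.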
