Counterfactual reasoning and learning systems: the example of computational advertising

Authors: Elon Portugaly, Léon Bottou, D. Max Chickering, Denis X. Charles, Dipankar Ray

DOI:

Keywords:

Abstract: This work shows how to leverage causal inference to understand the behavior of complex learning systems interacting with their environment and predict the consequences of changes to the system. Such predictions allow both humans and algorithms to select the changes that would have improved the system performance. This work is illustrated by experiments on the ad placement system associated with the Bing search engine.

References (59)
Damien Ernst, Arthur Louette. Introduction to Reinforcement Learning. MIT Press (1998).
Joannès Vermorel, Mehryar Mohri. Multi-armed Bandit Algorithms and Empirical Evaluation. Machine Learning: ECML 2005, pp. 437-448 (2005). doi:10.1007/11564096_42
Clark N. Glymour, Peter Spirtes, Richard Scheines. Causation, Prediction, and Search (1993).
Vladimir Naumovich Vapnik. Estimation of Dependences Based on Empirical Data (2010).
Doina Precup, Volodymyr Kuleshov. Algorithms for Multi-armed Bandit Problems. arXiv: Artificial Intelligence (2014).
J.C. Gittins. Bandit Processes and Dynamic Allocation Indices. Research Papers in Economics (2010).
Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári. Tuning Bandit Algorithms in Stochastic Environments. Lecture Notes in Computer Science, pp. 150-165 (2007). doi:10.1007/978-3-540-75225-7_15
Denver Dash, Marek Druzdzel. Caveats for Causal Reasoning with Equilibrium Models. European Conference on Symbolic and Quantitative Approaches to Reasoning and Uncertainty, pp. 192-203 (2001). doi:10.1007/3-540-44652-4_18
Massimiliano Pontil, Andreas Maurer. Empirical Bernstein Bounds and Sample Variance Penalization. Conference on Learning Theory (2009).