Change Point Detection and Meta-Bandits for Online Learning in Dynamic Environments

作者: Olivier Teytaud , Nicolas Baskiotis , Sylvain Gelly , Cédric Hartland , Michèle Sebag

DOI:

关键词:

摘要: Motivated by realtime website optimization, this paper is about online learning in abruptly changing environments. Two extensions of the UCBT algorithm are combined order to handle dynamic multi-armed bandits, and specifically cope with fast variations rewards. Firstly, a change point detection test based on Page-Hinkley statistics used overcome limitations due inertia. Secondly, controlled forgetting strategy dubbed Meta-Bandit proposed take care Exploration vs Exploitation trade-off when PH triggered. Extensive empirical validation shows significant improvements compared baseline algorithms. The also investigates sensitivity respect number available options.

参考文章(17)
Nicolo Cesa-Bianchi, Gabor Lugosi, Prediction, learning, and games ,(2006)
David V. Hinkley, Inference in Two-Phase Regression Journal of the American Statistical Association. ,vol. 66, pp. 736- 743 ,(1971) , 10.1080/01621459.1971.10482337
G. Lorden, PROCEDURES FOR REACTING TO A CHANGE IN DISTRIBUTION Annals of Mathematical Statistics. ,vol. 42, pp. 1897- 1908 ,(1971) , 10.1214/AOMS/1177693055
Michèle Basseville, Detecting changes in signals and systems—a survey Automatica. ,vol. 24, pp. 309- 326 ,(1988) , 10.1016/0005-1098(88)90073-8
George V. Moustakides, Optimal stopping times for detecting changes in distributions Annals of Statistics. ,vol. 14, pp. 1379- 1387 ,(1986) , 10.1214/AOS/1176350164
E. S. PAGE, CONTINUOUS INSPECTION SCHEMES Biometrika. ,vol. 41, pp. 100- 115 ,(1954) , 10.1093/BIOMET/41.1-2.100
David V. Hinkley, Inference about the change-point in a sequence of binomial variables Biometrika. ,vol. 57, pp. 1- 17 ,(1970) , 10.1093/BIOMET/57.1.1
D. V. HINKLEY, Inference about the change-point from cumulative sum tests Biometrika. ,vol. 58, pp. 509- 523 ,(1971) , 10.1093/BIOMET/58.3.509
Graham Cormode, S. Muthukrishnan, An improved data stream summary: the count-min sketch and its applications Journal of Algorithms. ,vol. 55, pp. 58- 75 ,(2005) , 10.1016/J.JALGOR.2003.12.001