Approximate robust control of uncertain dynamical systems

Edouard Leurent, Yann Blanco, Denis Efimov, Odalric-Ambrym Maillard
arXiv preprint arXiv:1903.00220

26
2019
Mathematics of statistical sequential decision making

Odalric-Ambrym Maillard
Université de Lille, Sciences et Technologies

24
2019
APPRENTISSAGE SÉQUENTIEL: Bandits, Statistique et Renforcement.

Odalric-Ambrym Maillard
Université des Sciences et Technologie de Lille-Lille I

15
2011
Efficient change-point detection for tackling piecewise-stationary bandits

Lilian Besson, Emilie Kaufmann, Odalric-Ambrym Maillard, Julien Seznec
Journal of Machine Learning Research 23 (77), 1-40

14
2022
Stochastic online linear regression: the forward algorithm to replace ridge

Reda Ouhamma, Odalric-Ambrym Maillard, Vianney Perchet
Advances in Neural Information Processing Systems 34, 24430-24441

13
2021
Distribution-dependent and time-uniform bounds for piecewise iid bandits

Subhojyoti Mukherjee, Odalric-Ambrym Maillard
arXiv preprint arXiv:1905.13159

11
2019
IMED-RL: Regret optimal learning of ergodic Markov decision processes

Fabien Pesquerel, Odalric-Ambrym Maillard
Advances in Neural Information Processing Systems 35, 26363-26374

10
2022
Efficient change-point detection for tackling piecewise-stationary bandits

Lilian Besson, Emilie Kaufmann, Odalric-Ambrym Maillard, Julien Seznec
arXiv preprint arXiv:1902.01575

9
2019
Bandits corrupted by nature: Lower bounds on regret and robust optimistic algorithm

Debabrota Basu, Odalric-Ambrym Maillard, Timothée Mathieu
arXiv preprint arXiv:2203.03186

7
2022
Stochastic bandits with groups of similar arms

Fabien Pesquerel, Hassan Saber, Odalric-Ambrym Maillard
Advances in Neural Information Processing Systems 34, 19461-19472

6
2021
Forced-exploration free strategies for unimodal bandits

Hassan Saber, Pierre Ménard, Odalric-Ambrym Maillard
arXiv preprint arXiv:2006.16569

6
2020
From optimality to robustness: Adaptive re-sampling strategies in stochastic bandits

Dorian Baudry, Patrick Saux, Odalric-Ambrym Maillard
Advances in Neural Information Processing Systems 34, 14029-14041

5
2021
Local Dvoretzky–Kiefer–Wolfowitz Confidence Bands

Odalric-Ambrym Maillard
Mathematical Methods of Statistics

4
2022
From optimality to robustness: Dirichlet sampling strategies in stochastic bandits

Dorian Baudry, Patrick Saux, Odalric-Ambrym Maillard
NeurIPS 2021 - 35th International Conference on Neural Information Processing Systems

4
2021
Online sign identification: Minimization of the number of errors in thresholding bandits

Reda Ouhamma, Rémy Degenne, Pierre Gaillard, Vianney Perchet
Advances in Neural Information Processing Systems 34, 18577-18589

3
2021
Indexed minimum empirical divergence for unimodal bandits

Hassan Saber, Pierre Ménard, Odalric-Ambrym Maillard
Advances in Neural Information Processing Systems 34, 7346-7356

3
2021
Routine bandits: Minimizing regret on recurring problems

Hassan Saber, Léo Saci, Odalric-Ambrym Maillard, Audrey Durand
Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part I 21, 3-18

3
2021
Optimal strategies for graph-structured bandits

Hassan Saber, Pierre Ménard, Odalric-Ambrym Maillard
arXiv preprint arXiv:2007.03224

3
2020
Robust estimation, prediction and control with linear dynamics and generic costs

Edouard Leurent, Denis Efimov, Odalric-Ambrym Maillard
Advances in Neural Information Processing Systems (NeurIPS)

3
2020
Latent Bandits, January 2014

Odalric-Ambrym Maillard, Shie Mannor
Extended version of the paper accepted to ICML

2
2014