Apprentissage par Renforcement sans Modèle et avec Action Continue [Model-Free Reinforcement Learning with Continuous Action]

Patrick Pilarski, Richard Sutton, Thomas Degris
Journées Francophones sur la planification, la décision et l'apprentissage pour le contrôle des systèmes - JFPDA 2012

2012
Representing Knowledge in a Computational Constructivist Agent

T. Degris
Constructivist Foundations 9 (1) 63-64

2013
Model-Free Reinforcement Learning with Continuous Action in Practice

T. Degris, P. M. Pilarski, R. S. Sutton
American Control Conference (ACC) 2177-2182

254
2012
Rapid response of head direction cells to reorienting visual cues: a computational model

T. Degris, O. Sigaud, S.I. Wiener, A. Arleo
Neurocomputing 58 675-682

8
2004
Adaptive artificial limbs: a real-time approach to prediction and anticipation

P. M. Pilarski, M. R. Dawson, T. Degris, J. P. Carey
IEEE Robotics & Automation Magazine 20 (1) 53-64

39
2013
Online human training of a myoelectric prosthesis controller via actor-critic reinforcement learning

P. M. Pilarski, M. R. Dawson, T. Degris, F. Fahimi
IEEE International Conference on Rehabilitation Robotics (ICORR) 2011 1-7

122
2011
Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction

Doina Precup, Patrick M. Pilarski, Richard S. Sutton, Adam White
International Conference on Autonomous Agents and Multiagent Systems (AAMAS) 761-768

256
2011
Deterministic Policy Gradient Algorithms

Daan Wierstra, Martin Riedmiller, Guy Lever, Nicolas Heess
International Conference on Machine Learning (ICML) 387-395

3,693
2014
The predictron: end-to-end learning and planning

Tom Schaul, Gabriel Dulac-Arnold, Arthur Guez, David Reichert
International Conference on Machine Learning (ICML) 3191-3199

258
2017
Apprentissage par renforcement exploitant la structure additive des MDP factorisés [Reinforcement learning exploiting the additive structure of factored MDPs]

Olivier Sigaud, Pierre-Henri Wuillemin, Thomas Degris
JFPDA 2007 - 2e Journées Francophones Planification, Décision, Apprentissage pour la conduite de système 49-60

2007
A Spiking Neuron Model of Head-Direction Cells for Robot Orientation

Loïc Lachèze, Christian Boucheny, Thomas Degris, Angelo Arleo
Simulation of Adaptive Behavior (SAB) 255-263

18
2004
Apprentissage par renforcement factorisé pour le comportement de personnages non joueurs [Factored reinforcement learning for the behavior of non-player characters]

Thomas Degris, Olivier Sigaud, Pierre-Henri Wuillemin
Revue des Sciences et Technologies de l'Information - Série RIA : Revue d'Intelligence Artificielle 23 221-251

1
2009
Meta-Descent for Online, Continual Prediction

Andrew Jacobsen, Matthew Schlegel, Cameron Linke, Thomas Degris
Proceedings of the AAAI Conference on Artificial Intelligence 33 (01) 3943-3950

18
2019
Vector-based navigation using grid-like representations in artificial agents

Alexander Pritzel, Andrea Banino, Benigno Uria, Brian C Zhang
Nature 557 (7705) 429-433

546
2018
Tuning-free step-size adaptation

Ashique Rupam Mahmood, Richard S. Sutton, Thomas Degris, Patrick M. Pilarski
2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2121-2124

60
2012
Factored Markov Decision Processes

Thomas Degris, Olivier Sigaud
Markov Decision Processes in Artificial Intelligence 99-126

12
2013
Dynamic switching and real-time machine learning for improved human control of assistive biomedical robots

Patrick M. Pilarski, Michael R. Dawson, Thomas Degris, Jason P. Carey
IEEE International Conference on Biomedical Robotics and Biomechatronics (BioRob) 296-302

36
2012
Learning the structure of Factored Markov Decision Processes in reinforcement learning problems

Thomas Degris, Olivier Sigaud, Pierre-Henri Wuillemin
Proceedings of the 23rd International Conference on Machine Learning (ICML '06) 257-264

154
2006
Adapting Behavior via Intrinsic Reward: A Survey and Empirical Study

Cam Linke, Nadia M. Ady, Martha White, Thomas Degris
Journal of Artificial Intelligence Research 69 1287-1332

23
2020
Deep reinforcement learning in large discrete action spaces

Gabriel Dulac-Arnold, Richard Evans, Hado van Hasselt, Peter Sunehag
arXiv preprint arXiv:1512.07679

517
2015