Apprentissage par Renforcement sans Modèle et avec Action Continue [Model-Free Reinforcement Learning with Continuous Action]

Patrick Pilarski, Richard Sutton, Thomas Degris
Journées Francophones sur la planification, la décision et l'apprentissage pour le contrôle des systèmes - JFPDA 2012

2012
Representing Knowledge in a Computational Constructivist Agent

T. Degris
Constructivist Foundations 9 (1) 63-64

2013
Model-Free Reinforcement Learning with Continuous Action in Practice

T. Degris, P. M. Pilarski, R. S. Sutton
American Control Conference (ACC) 2177-2182

254
2012
Rapid response of head direction cells to reorienting visual cues: a computational model

T. Degris, O. Sigaud, S.I. Wiener, A. Arleo
Neurocomputing 58 675-682

8
2004
Adaptive artificial limbs: a real-time approach to prediction and anticipation

P. M. Pilarski, M. R. Dawson, T. Degris, J. P. Carey
IEEE Robotics & Automation Magazine 20 (1) 53-64

39
2013
Online human training of a myoelectric prosthesis controller via actor-critic reinforcement learning

P. M. Pilarski, M. R. Dawson, T. Degris, F. Fahimi
IEEE International Conference on Rehabilitation Robotics (ICORR) 2011 1-7

122
2011
Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction

Doina Precup, Patrick M. Pilarski, Richard S. Sutton, Adam White
International Conference on Autonomous Agents and Multiagent Systems (AAMAS) 761-768

256
2011
Deterministic Policy Gradient Algorithms

Daan Wierstra, Martin Riedmiller, Guy Lever, Nicolas Heess
International Conference on Machine Learning (ICML) 387-395

3,693
2014
The predictron: end-to-end learning and planning

Tom Schaul, Gabriel Dulac-Arnold, Arthur Guez, David Reichert
International Conference on Machine Learning (ICML) 3191-3199

258
2017
Apprentissage par renforcement exploitant la structure additive des MDP factorisés [Reinforcement learning exploiting the additive structure of factored MDPs]

Olivier Sigaud, Pierre-Henri Wuillemin, Thomas Degris
JFPDA 2007 - 2e Journées Francophones Planification, Décision, Apprentissage pour la conduite de système 49-60

2007
A Spiking Neuron Model of Head-Direction Cells for Robot Orientation

Loïc Lachèze, Christian Boucheny, Thomas Degris, Angelo Arleo
Simulation of Adaptive Behavior (SAB) 255-263

18
2004
Apprentissage par renforcement factorisé pour le comportement de personnages non joueurs [Factored reinforcement learning for the behavior of non-player characters]

Thomas Degris, Olivier Sigaud, Pierre-Henri Wuillemin
Revue des Sciences et Technologies de l'Information - Série RIA : Revue d'Intelligence Artificielle 23 221-251

1
2009
Meta-Descent for Online, Continual Prediction

Andrew Jacobsen, Matthew Schlegel, Cameron Linke, Thomas Degris
Proceedings of the AAAI Conference on Artificial Intelligence 33 (01) 3943-3950

18
2019
Vector-based navigation using grid-like representations in artificial agents

Alexander Pritzel, Andrea Banino, Benigno Uria, Brian C Zhang
Nature 557 (7705) 429-433

546
2018
Tuning-free step-size adaptation

Ashique Rupam Mahmood, Richard S. Sutton, Thomas Degris, Patrick M. Pilarski
2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2121-2124

60
2012
Factored Markov Decision Processes

Thomas Degris, Olivier Sigaud
Markov Decision Processes in Artificial Intelligence 99-126

12
2013
Dynamic switching and real-time machine learning for improved human control of assistive biomedical robots

Patrick M. Pilarski, Michael R. Dawson, Thomas Degris, Jason P. Carey
IEEE International Conference on Biomedical Robotics and Biomechatronics (BioRob) 296-302

36
2012
Learning the structure of Factored Markov Decision Processes in reinforcement learning problems

Thomas Degris, Olivier Sigaud, Pierre-Henri Wuillemin
Proceedings of the 23rd International Conference on Machine Learning (ICML '06) 257-264

154
2006
Adapting Behavior via Intrinsic Reward: A Survey and Empirical Study

Cam Linke, Nadia M. Ady, Martha White, Thomas Degris
Journal of Artificial Intelligence Research 69 1287-1332

23
2020
Deep reinforcement learning in large discrete action spaces

Gabriel Dulac-Arnold, Richard Evans, Hado van Hasselt, Peter Sunehag
arXiv preprint arXiv:1512.07679

517
2015