Selecting reinforcement learning actions using goals and observations

Tom Schaul , Daniel George Horgan , Karol Gregor , David Silver

10
2020
Reinforcement learning using distributed prioritized replay

David Budden , Gabriel Barth-Maron , John Quan , Daniel George Horgan

7
2023
Unicorn: Continual learning with a universal, off-policy agent

Tom Schaul , Daniel J. Mankowitz , Matteo Hessel , Junhyuk Oh
arXiv: Learning

39
2018
Distributed Distributional Deterministic Policy Gradients

David Budden , Nicolas Heess , Dan Horgan , Matthew W. Hoffman
arXiv: Learning

451
2018
Observe and Look Further: Achieving Consistent Performance on Atari

Olivier Pietquin , Rémi Munos , David Budden , Mohammad Gheshlaghi Azar
arXiv: Learning

85
2018
Distributed Prioritized Experience Replay

David Budden , Matteo Hessel , Dan Horgan , John Quan
international conference on learning representations

640
2018
Deep Q-learning From Demonstrations

Ian Osband , Tom Schaul , Olivier Pietquin , Gabriel Dulac-Arnold
national conference on artificial intelligence 3223-3230

885
2018
Grandmaster level in StarCraft II using multi-agent reinforcement learning

Oriol Vinyals , Igor Babuschkin , Wojciech M Czarnecki , Michaël Mathieu
Nature 575 (7782) 350-354

2,756
2019
AlphaStar: Mastering the real-time strategy game StarCraft II

Oriol Vinyals , Igor Babuschkin , Junyoung Chung , Michael Mathieu
DeepMind blog 2 20-20

550
2019
Vision-Language Models as a Source of Rewards

Kate Baumli , Satinder Baveja , Feryal Behbahani , Harris Chan
ALOE 2023

3
2023
Towards Consistent Performance on Atari using Expert Demonstrations

Tobias Pohlen , Bilal Piot , Todd Hester , Mohammad Gheshlaghi Azar

Universal Value Function Approximators

Tom Schaul , Daniel Horgan , David Silver , Karol Gregor
international conference on machine learning 1312-1320

936
2015
Rainbow: Combining Improvements in Deep Reinforcement Learning

Tom Schaul , Mohammad Gheshlaghi Azar , Georg Ostrovski , Bilal Piot
national conference on artificial intelligence 3215-3222

1,878
2017
Gemini: a family of highly capable multimodal models

Gemini Team , Rohan Anil , Sebastian Borgeaud , Jean-Baptiste Alayrac
arXiv preprint arXiv:2312.11805

695
2023
Evaluating frontier models for dangerous capabilities

Mary Phuong , Matthew Aitchison , Elliot Catt , Sarah Cogan
arXiv preprint arXiv:2403.13793

3
2024
Gemma 2: Improving Open Language Models at a Practical Size

Gemma Team , Morgane Riviere , Shreya Pathak , Pier Giuseppe Sessa
arXiv preprint arXiv:2408.00118

2024
Evaluating frontier models for dangerous capabilities

Mary Phuong , Matthew Aitchison , Elliot Catt , Sarah Cogan , ... , Seliem El-Sayed , Sasha Brown , Anca Dragan , Rohin Shah , Allan Dafoe , Toby Shevlane

12
2024
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Gemini Team , Petko Georgiev , Ving Ian Lei , Ryan Burnell
arXiv preprint arXiv:2403.05530

39
2024