Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations

Scott Niekum, Daniel S. Brown, Wonjoon Goo, Prabhat Nagarajan
arXiv: Learning

216
2019
Learning Latent State Spaces for Planning through Reward Prediction

Yi Ouyang, Yasuhiro Fujita, Aaron J. Havens, Prabhat Nagarajan
arXiv: Learning

4
2019
ChainerRL: A Deep Reinforcement Learning Library

Toshiki Kataoka, Yasuhiro Fujita, Prabhat Nagarajan, Takahiro Ishikawa
arXiv: Learning

92
2019
Periodic Intra-Ensemble Knowledge Distillation for Reinforcement Learning

Zhang-Wei Hong, Prabhat Nagarajan, Guilherme Maeda
arXiv: Learning

3
2020
Distributed Reinforcement Learning of Targeted Grasping with Active Vision for Mobile Manipulators

Yasuhiro Fujita, Prabhat Nagarajan, Kota Uenishi, Shimpei Masuda
Intelligent Robots and Systems 9712-9719

14
2020
Deterministic implementations for reproducibility in deep reinforcement learning

Prabhat Nagarajan, Garrett Warnell, Peter Stone
arXiv preprint arXiv:1809.05676

58
2018
The impact of nondeterminism on reproducibility in deep reinforcement learning

Prabhat Nagarajan, Garrett Warnell, Peter Stone
International Conference on Machine Learning Workshop on Reproducibility in Machine Learning

32
2018
Reconnaissance for Reinforcement Learning with Safety Constraints

Shin-ichi Maeda, Hayato Watahiki, Yi Ouyang, Shintarou Okada
Joint European Conference on Machine Learning and Knowledge Discovery in Databases 567-582

2
2021
When is Offline Policy Selection Sample Efficient for Reinforcement Learning?

Vincent Liu, Prabhat Nagarajan, Andrew Patterson, Martha White
arXiv preprint arXiv:2312.02355

2023
Swarm-inspired Reinforcement Learning via Collaborative Inter-agent Knowledge Distillation

Zhang-Wei Hong, Prabhat Nagarajan, Guilherme Maeda
Workshop on Deep Reinforcement Learning at the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019)

2019