Boosted Fitted Q-Iteration

Matteo Pirotta , Marcello Restelli , Carlo D’Eramo , Samuele Tosatto
International Conference on Machine Learning 3434-3443

39
2017
An Upper Bound of the Bias of Nadaraya-Watson Kernel Regression under Lipschitz Assumptions

Jan Peters , Riad Akrour , Samuele Tosatto
arXiv: Machine Learning

5
2020
A Nonparametric Off-Policy Policy Gradient

Jan Peters , Hany Abdulsamad , Samuele Tosatto , João Carvalho
arXiv: Learning

9
2020
Batch Reinforcement Learning with a Nonparametric Off-Policy Policy Gradient

Jan Peters , Samuele Tosatto , João Carvalho
arXiv: Learning

2
2020
Learning inverse dynamics models in O(n) time with LSTM networks

Elmar Rueckert , Moritz Nakatenus , Samuele Tosatto , Jan Peters
2017 IEEE-RAS 17th International Conference on Humanoid Robotics (Humanoids) 811-816

57
2017
Exploration Driven by an Optimistic Bellman Equation

Samuele Tosatto , Carlo D’Eramo , Joni Pajarinen , Marcello Restelli
International Joint Conference on Neural Networks 1-8

2
2019
Balloon Estimators for Improving and Scaling the Nonparametric Off-Policy Policy Gradient

Fabio d’Aquino Hilt , JanNiklas Kolf , Christian Weiland , João Carvalho
Smpte Journal

Making Policy Gradient Estimators for Softmax Policies More Robust to Non-stationarities

Shivam Garg , Samuele Tosatto , Yangchen Pan , Martha White
Smpte Journal

An alternate policy gradient estimator for softmax policies

Shivam Garg , Samuele Tosatto , Yangchen Pan , Martha White
arXiv preprint arXiv:2112.11622

1
2021
Contextual latent-movements off-policy optimization for robotic manipulation skills

Samuele Tosatto , Georgia Chalvatzaki , Jan Peters
2021 IEEE International Conference on Robotics and Automation (ICRA) 10815-10821

14
2021
Dimensionality Reduction of Movement Primitives in Parameter Space

Samuele Tosatto , Jonas Stadtmüller , Jan Peters
arXiv preprint arXiv:2003.02634

2020
Deep probabilistic movement primitives with a Bayesian aggregator

Michael Przystupa , Faezeh Haghverd , Martin Jagersand , Samuele Tosatto
2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 3704-3711

3
2023
Dynamic decision frequency with continuous options

Amirmohammad Karimi , Jun Jin , Jun Luo , A Rupam Mahmood
2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 7545-7552

1
2023
Model-free Policy Learning with Reward Gradients

Qingfeng Lan , Samuele Tosatto , Homayoon Farrahi , A Rupam Mahmood
arXiv: Learning

4
2021
A temporal-difference approach to policy gradient estimation

Samuele Tosatto , Andrew Patterson , Martha White , A Rupam Mahmood
(ICML) The 39th International Conference on Machine Learning

3
2022
A Gradient Critic for Policy Gradient Estimation

Samuele Tosatto , Andrew Patterson , Martha White , A Rupam Mahmood
Sixteenth European Workshop on Reinforcement Learning

1
2023
Variable-Decision Frequency Option Critic

Amirmohammad Karimi , Jun Jin , Jun Luo , A Rupam Mahmood
CoRR

2022
Technical Report: “Exploration Driven by an Optimistic Bellman Equation”

Samuele Tosatto , Carlo D’Eramo , Joni Pajarinen , Marcello Restelli

2018
Pink Noise LQR: How does Colored Noise affect the Optimal Policy in RL?

Jakob Hollenstein , Marko Zaric , Samuele Tosatto , Justus Piater
ICML 2024 Workshop: Foundations of Reinforcement Learning and Control--Connections and Perspectives