
Is it possible to implement a prioritized replay buffer (PER) in a TD3 agent?

Michael Müller on 18 Jun 2021
Hey,
I'm trying to implement a TD3 agent using MATLAB. But instead of using a replay buffer that randomly chooses samples for the mini-batch, I would like to implement a prioritized replay buffer. So far, I couldn't find an agent option to do so.
I would be very grateful if somebody could help me with my problem.
Thanks in advance for the answers.
Best regards,
Michael

Answers (1)

Ahmed R. Sayed on 30 Sep 2022
By default, built-in off-policy agents (DQN, DDPG, TD3, SAC, MBPO) use an rlReplayMemory object as their experience buffer. Agents uniformly sample data from this buffer. To perform nonuniform prioritized sampling [1], which can improve sample efficiency when training your agent, use an rlPrioritizedReplayMemory object. Please refer to the rlPrioritizedReplayMemory documentation page.
[1] Schaul, Tom, John Quan, Ioannis Antonoglou, and David Silver. "Prioritized Experience Replay." arXiv:1511.05952 [cs], 25 February 2016. https://arxiv.org/abs/1511.05952.
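
For illustration, here is a minimal sketch of how such a buffer can be attached to a TD3 agent. As far as I know, rlPrioritizedReplayMemory was introduced in R2022b, so you will need that release or newer. The observation and action specifications below are placeholders; replace them with the specs from your own environment.

% Placeholder observation and action specifications
obsInfo = rlNumericSpec([4 1]);
actInfo = rlNumericSpec([1 1],'LowerLimit',-1,'UpperLimit',1);

% Create a TD3 agent with default actor and critic networks
agent = rlTD3Agent(obsInfo,actInfo);

% Replace the default uniform-sampling buffer with a prioritized replay buffer
agent.ExperienceBuffer = rlPrioritizedReplayMemory(obsInfo,actInfo,1e6);

Training with train then proceeds exactly as with the default buffer; only the mini-batch sampling changes. The prioritization settings (for example the priority exponent and the importance-sampling annealing schedule) are properties of the buffer object; please check the rlPrioritizedReplayMemory reference page for the exact property names in your release.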
