Is it possible to implement a prioritized replay buffer (PER) in a TD3 agent?
Hey,
I'm trying to implement a TD3 agent in MATLAB. Instead of a replay buffer that samples the mini-batch uniformly at random, I would like to implement a prioritized replay buffer. So far, I couldn't find an agent option to do so.
I would be very grateful if somebody could help me with my problem.
Thanks in advance for the answers.
Best regards,
Michael
Answers (1)
Ahmed R. Sayed
on 30 Sep 2022
By default, built-in off-policy agents (DQN, DDPG, TD3, SAC, MBPO) use an rlReplayMemory object as their experience buffer. Agents sample data uniformly from this buffer. To perform nonuniform prioritized sampling [1], which can improve sample efficiency when training your agent, use an rlPrioritizedReplayMemory object instead. Please refer to the rlPrioritizedReplayMemory documentation.
[1] Schaul, Tom, John Quan, Ioannis Antonoglou, and David Silver. 'Prioritized experience replay'. arXiv:1511.05952 [Cs] 25 February 2016. https://arxiv.org/abs/1511.05952.
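As a rough illustration of the suggestion above, the sketch below replaces the default experience buffer of a TD3 agent with a prioritized one. It assumes a Reinforcement Learning Toolbox release that ships rlPrioritizedReplayMemory. The predefined double-integrator environment is only a stand-in for your own environment, and the tuning values are illustrative, not recommendations. The property names are written as I recall them from the documentation, so please check them against the reference page for your release.

```matlab
% Sketch only: the environment is a placeholder with a continuous action space.
env = rlPredefinedEnv("DoubleIntegrator-Continuous");
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);

% Default TD3 agent created from the observation and action specifications.
agent = rlTD3Agent(obsInfo, actInfo);

% Replace the default rlReplayMemory buffer with a prioritized buffer.
agent.ExperienceBuffer = rlPrioritizedReplayMemory(obsInfo, actInfo);

% Optional tuning (illustrative values; verify the property names in the docs).
agent.ExperienceBuffer.PriorityExponent = 0.6;
agent.ExperienceBuffer.InitialImportanceSamplingExponent = 0.4;
agent.ExperienceBuffer.NumAnnealingSteps = 1e4;
```

After this, training with train(agent, env, trainOpts) should draw mini-batches according to priority rather than uniformly. Assigning a new buffer discards any experiences stored in the old one, so do the swap before training starts.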