Reinforcement Learning Toolbox - When does the algorithm train?

I am currently using the RL Toolbox with a DQN agent built into a long-running process simulation.
The maximum step count is currently 8000 steps per episode.
Unfortunately, the documentation seems a little ambiguous to me, so here is my question:
Does the train function of the RL Toolbox train the agent at the end of an episode, or during the episode once the step count exceeds the minibatch size (as in the baseline algorithms)?
Thank you in advance.

Accepted Answer

Emmanouil Tzorakoleftherakis
The implementation is based on the algorithm listed here.
Weights are being updated at each time step.
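
To make the difference between the two schedules concrete, here is a minimal Python sketch of a per-step update schedule. It is not the toolbox's code: the function `count_updates`, the placeholder transitions, and the rule that updates start only once the buffer holds at least one minibatch are all assumptions made for illustration.

```python
import random


def count_updates(max_steps, minibatch_size, seed=0):
    """Count minibatch updates in one episode under a per-step schedule.

    Hypothetical helper: it only models *when* updates happen,
    not the Q-network or the environment.
    """
    rng = random.Random(seed)
    buffer = []   # stand-in for the experience buffer
    updates = 0
    for step in range(max_steps):
        # Store a placeholder transition for this time step.
        buffer.append((step, rng.random()))
        # Assumption: an update is possible as soon as the buffer
        # holds at least one full minibatch.
        if len(buffer) >= minibatch_size:
            batch = rng.sample(buffer, minibatch_size)
            assert len(batch) == minibatch_size
            updates += 1  # one weight update per time step
    return updates


# An 8000-step episode with a minibatch size of 64 (the minibatch
# size here is an example value, not taken from the question).
print(count_updates(8000, 64))
```

Under these assumptions, an 8000-step episode produces thousands of weight updates, whereas an end-of-episode schedule would produce updates only after the episode finishes.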
1 Comment
Hans-Joachim Steinort on 26 Sep 2019
"For each training time step" - that was the line I was looking for (though looking into the source code led me to the same conclusion).
After double-checking the baseline algorithms, I found that they do it the same way.
Thank you for your time!


Version

R2019a
