Why is the DDPG episode rewards never change during the whole training process?

Question

Guoge Tan le 25 Mai 2020

0
Lien

Utiliser le lien direct vers cette question

https://fr.mathworks.com/matlabcentral/answers/532933-why-is-the-ddpg-episode-rewards-never-change-during-the-whole-training-process

Commenté : Shahriar le 29 Juin 2022

Réponse acceptée : Emmanouil Tzorakoleftherakis

I'm training a DDPG agent using the Reinforcement Learning toolbox on MATLAB R2020a for a path planning problem. But as you can see, the DDPG episode rewards and average rewards never change during 5000 episodes. I used a simple neural networks with 20 neurons and three layers, the learning rate is set to 0.01, and the Gradient Threshold is 1. Then I try to set weights and bias for fully connected layers and change my reward function, but the result is the same.

I also saw at here that others have a similar problem. So any advice for my problem? Thank you.

1 commentaire
Afficher -1 commentaires plus anciensMasquer -1 commentaires plus anciens

Shahriar le 29 Juin 2022

@Guoge Tan could you solve this issue? I have a similar situation.

Connectez-vous pour commenter.

Connectez-vous pour répondre à cette question.

Answer 1

Emmanouil Tzorakoleftherakis le 26 Mai 2020

0
Lien

Utiliser le lien direct vers cette réponse

https://fr.mathworks.com/matlabcentral/answers/532933-why-is-the-ddpg-episode-rewards-never-change-during-the-whole-training-process#answer_439593

Looks like the scale between Q0 and episode reward is very different. Try unchecking "Show Episode Q0" to see of the episode reward changes. I would then simplify the critic network to make sure it outputs values in a similar scale as the episode reward.

0 commentaires
Afficher -2 commentaires plus anciensMasquer -2 commentaires plus anciens

Connectez-vous pour commenter.

Why is the DDPG episode rewards never change during the whole training process?

1 commentaire
Afficher -1 commentaires plus anciensMasquer -1 commentaires plus anciens

Réponse acceptée

0 commentaires
Afficher -2 commentaires plus anciensMasquer -2 commentaires plus anciens

Plus de réponses (0)

Voir également

Catégories

Tags

Produits

Version

Community Treasure Hunt

Why is the DDPG episode rewards never change during the whole training process?

1 commentaire Afficher -1 commentaires plus anciensMasquer -1 commentaires plus anciens

Réponse acceptée

0 commentaires Afficher -2 commentaires plus anciensMasquer -2 commentaires plus anciens

Plus de réponses (0)

Voir également

Catégories

Tags

Produits

Version

Community Treasure Hunt

1 commentaire
Afficher -1 commentaires plus anciensMasquer -1 commentaires plus anciens

0 commentaires
Afficher -2 commentaires plus anciensMasquer -2 commentaires plus anciens