reinforcement learning and DDPG agent problem

Question

beni hadi on 18 Sep 2020

https://fr.mathworks.com/matlabcentral/answers/595939-reinforcement-learning-and-ddpg-agent-problem

Commented: beni hadi on 19 Sep 2020

Accepted Answer: Emmanouil Tzorakoleftherakis

I used a deep reinforcement learning toolbox, with the DDPG algorithm, for path planning of a robot. In my scenario, the robot starts from a random position and must reach a random goal location. After training, the result is a fixed path: when I change the goal position, the path does not change. It is as if the network has learned only one path. A dropout layer is used in the network structure.

Does anyone have any idea what went wrong?


Answer 1

Emmanouil Tzorakoleftherakis on 18 Sep 2020

https://fr.mathworks.com/matlabcentral/answers/595939-reinforcement-learning-and-ddpg-agent-problem#answer_496897

Looks like training was not successful. There could be many things at fault here - some suggestions:

1) Make sure you are randomizing the target locations at the beginning of each episode. It would help to add visualization so you can verify that the targets actually move and debug the agent's behavior during training.

2) The agent may not have enough information available to make decisions. Make sure the observations provide enough info to the agent

3) What does the episode manager plot look like when training stops? You may need to train the agent for more time

4) Why are you using a dropout layer? Unless your observations are images, this layer is likely not required (at least I don't think I have seen it in any shipping examples in Reinforcement Learning Toolbox). So your neural network architecture may also have something to do with this behavior.
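Suggestions 1) and 2) can be checked outside of training. The following is a minimal, language-agnostic sketch in plain Python, not Reinforcement Learning Toolbox code. The function names (reset_episode, make_observation), the workspace limits, and the choice of observation signals are all assumptions made for illustration. It samples a new goal each episode, prints the spread of the sampled goals, and builds an observation that encodes the goal relative to the robot, so that a different goal necessarily produces a different input to the policy.

```python
import math
import random


def reset_episode(rng, low=-5.0, high=5.0):
    """Sample independent random start and goal positions (hypothetical limits)."""
    start = (rng.uniform(low, high), rng.uniform(low, high))
    goal = (rng.uniform(low, high), rng.uniform(low, high))
    return start, goal


def make_observation(robot_xy, heading, goal_xy):
    """Build an observation that describes the goal relative to the robot."""
    dx = goal_xy[0] - robot_xy[0]
    dy = goal_xy[1] - robot_xy[1]
    distance = math.hypot(dx, dy)
    # Bearing to the goal relative to the robot heading, wrapped to [-pi, pi).
    bearing = (math.atan2(dy, dx) - heading + math.pi) % (2 * math.pi) - math.pi
    return [dx, dy, distance, bearing]


# Sanity check: log the goals drawn over many resets and inspect their spread.
rng = random.Random(0)
goals = [reset_episode(rng)[1] for _ in range(1000)]
xs = [g[0] for g in goals]
ys = [g[1] for g in goals]
print("goal x range:", min(xs), max(xs))
print("goal y range:", min(ys), max(ys))
```

If the printed ranges are narrow compared with the workspace, or if the observation does not change when only the goal changes, the agent has little reason to learn goal-dependent behavior.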

beni hadi on 19 Sep 2020

Thank you for your suggestions.

I defined a bounded area: the episode stops if the robot leaves this area, and also when it comes within a suitable distance of the target. At the beginning of each episode, the positions of the robot and the target are reset. (These locations are selected randomly within a small region.)
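The termination logic described in this comment could be sketched as follows. This is plain Python for illustration only, not the poster's actual environment: the function name episode_status, the square bounds, and the goal radius are assumed values.

```python
import math


def episode_status(robot_xy, goal_xy, bounds=(-10.0, 10.0), goal_radius=0.5):
    """Classify the current step as 'reached', 'out_of_bounds', or 'running'.

    bounds and goal_radius are hypothetical values chosen for this sketch.
    """
    low, high = bounds
    x, y = robot_xy
    # Success: the robot is within goal_radius of the target.
    if math.hypot(goal_xy[0] - x, goal_xy[1] - y) <= goal_radius:
        return "reached"
    # Failure: the robot has left the allowed square area.
    if not (low <= x <= high and low <= y <= high):
        return "out_of_bounds"
    return "running"


print(episode_status((1.0, 1.0), (5.0, 5.0)))
print(episode_status((11.0, 0.0), (0.0, 0.0)))
print(episode_status((0.1, 0.0), (0.0, 0.0)))
```

One thing worth checking with a setup like this: if the start and goal positions are drawn from a region that is small relative to the allowed area, the episodes may look nearly identical to the agent, which may be consistent with a policy that produces a single path.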
