RL DDPG Actions have high oscillation

Ahmad Al Ali on 8 Nov 2023

https://fr.mathworks.com/matlabcentral/answers/2044812-rl-ddpg-actions-have-high-oscillation

Commented: Sourabh on 15 Dec 2023

Hello, I am using the DDPG agent from Reinforcement Learning Toolbox in MATLAB to train a 3-DOF robotic arm to move. The actions are joint torques, and although the target is reached, the actions are highly oscillatory and noisy.

Can anyone help explain where this comes from, i.e. the algorithm itself, the noise options, ...?

I am using the walking robot example to build noise options:

%% DDPG Agent Options
agentOptions = rlDDPGAgentOptions;
agentOptions.SampleTime = 0.025;
agentOptions.DiscountFactor = 0.99;
agentOptions.MiniBatchSize = 128;
agentOptions.ExperienceBufferLength = 5e5;
agentOptions.TargetSmoothFactor = 1e-3;
agentOptions.NoiseOptions.MeanAttractionConstant = 0.5;
agentOptions.NoiseOptions.Variance = 0.3;
agentOptions.NoiseOptions.VarianceDecayRate = 1e-5;

I think it might have something to do with MeanAttractionConstant, Variance, or VarianceDecayRate. (By the way, the joint limits are between -3 and 3.)

The actions I get look like this:

Answers (1)

Emmanouil Tzorakoleftherakis on 9 Nov 2023

https://fr.mathworks.com/matlabcentral/answers/2044812-rl-ddpg-actions-have-high-oscillation#answer_1349950

Hi,

The noise options you mention are used only during training, where they are essential for exploration. If the plots above are from training, consider reducing the noise variance a bit.
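To get a feel for how long the exploration noise stays large with the settings in the question, here is a rough back-of-the-envelope sketch. It assumes the noise scale is multiplied by (1 - VarianceDecayRate) once per agent step and that the per-step noise increment scales with Variance*sqrt(SampleTime); both are my reading of how the Ornstein-Uhlenbeck noise options behave, so please check them against the documentation for your release (newer releases may list these properties under different names). The variable names are only for illustration.

```matlab
% Values copied from the question
Ts        = 0.025;   % agentOptions.SampleTime, in seconds
variance  = 0.3;     % agentOptions.NoiseOptions.Variance
decayRate = 1e-5;    % agentOptions.NoiseOptions.VarianceDecayRate

% Assumption: the noise scale is multiplied by (1 - decayRate) at every agent step
halfLifeSteps   = log(0.5) / log(1 - decayRate);   % steps until the scale halves
halfLifeSeconds = halfLifeSteps * Ts;              % same, in simulated seconds

% Assumption: the random part of each noise update scales with variance*sqrt(Ts)
stepNoise = variance * sqrt(Ts);

fprintf('Half-life: %.0f steps (%.0f s of simulated time)\n', halfLifeSteps, halfLifeSeconds);
fprintf('Per-step noise scale: %.4f\n', stepNoise);
```

With these numbers the noise needs roughly 69,000 agent steps to halve, so if training is shorter than that the noise is essentially constant for the whole run. Whether that is too much depends on the range of your torque actions.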

If the plots are from the trained agent, consider penalizing large action changes in your reward signal. That would help reduce the oscillatory content.
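As a minimal sketch of such a penalty, one option is to subtract a weighted squared difference between the current and previous action from whatever reward you already compute. All names and numbers below (baseReward, u, uPrev, w) are placeholders for illustration, not taken from the original model; in Simulink the previous action would typically come from a delay on the action signal, and the weight needs tuning so that it does not dominate the tracking term.

```matlab
% Hypothetical example values; replace with signals from your own model
u          = [0.8; -0.2; 0.1];   % current joint torques
uPrev      = [0.5;  0.1; 0.1];   % joint torques from the previous agent step
w          = 0.1;                % penalty weight (to be tuned)
baseReward = 1.0;                % stands in for the existing reward term

% Quadratic penalty on the change in action between consecutive steps
actionRatePenalty = @(u, uPrev, w) w * sum((u - uPrev).^2);

% Shaped reward: existing reward minus the action-change penalty
reward = baseReward - actionRatePenalty(u, uPrev, w);
```

A larger w gives smoother torques but can slow down the approach to the target, so it is worth starting small and increasing it gradually.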

Hope this helps

8 comments (6 older comments not shown)

Ahmad Al Ali on 14 Dec 2023

@Sourabh I use a Rate Transition block in Simulink before feeding the observations to the agent:

Sourabh on 15 Dec 2023

Actually, I have a signal that I want to sample at 4-second intervals to build an array, and then feed that array to my observation. Can I do this with a Rate Transition block?

Categories

AI and Statistics Deep Learning Toolbox Applications Autonomous and Control Systems Reinforcement Learning

Release

R2021b
