Reinforcement learning actions using DDPG
Greetings. I'm Jason, and I'm working on controlling a bipedal robot using reinforcement learning. I need help deciding between the two methods below for DDPG:
1_ Generate random actions with a noise variance of 10% of my action range, based on the description of the DDPG noise model.
2_ Use a low variance such as 0.5, as was used in the MSRA biped and humanoid training with RL.
I would really appreciate your help with this. Also, in the latter case the actions are the output of a tanh layer with low variance ([-1.5 1.5]); how are they converted into the desired actions?
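To make the conversion question concrete: one common way to turn a bounded tanh output into a physical action is an affine rescale from [-1, 1] to the actuator limits. The sketch below is only an illustration of that idea, not a description of what the referenced biped or humanoid training does. The function name `scale_action` and the limits `LOW` and `HIGH` are hypothetical values chosen for the example.

```python
import math

def scale_action(raw, low, high):
    """Map a tanh output in [-1, 1] linearly onto [low, high]."""
    center = (high + low) / 2.0       # midpoint of the action range
    half_range = (high - low) / 2.0   # distance from midpoint to either limit
    return center + half_range * raw

# Hypothetical actuator limits, chosen only for illustration.
LOW, HIGH = -3.0, 3.0

pre_activation = 0.8                  # value entering the tanh layer
raw = math.tanh(pre_activation)       # bounded to (-1, 1)
action = scale_action(raw, LOW, HIGH)
print(action)                         # always lies inside [LOW, HIGH]
```

With this mapping the network itself only ever works with values in [-1, 1], and the scale and bias carry the physical units.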
Please consider that I'm pretty sure the action range I have calculated is good enough to solve the problem. I also tried higher variances, but they make the learning process less stable. Any suggestions on how I should generate the random actions?
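For clarity, this is roughly what I mean by method 1, written as a minimal sketch. It assumes Ornstein-Uhlenbeck noise (the kind commonly paired with DDPG), reads "10% of the action range" as the noise standard deviation, and clips the result to the valid range. All names and numbers here (`ou_step`, `noisy_action`, `theta`, `dt`, the limits) are hypothetical and not taken from any particular toolbox.

```python
import math
import random

def ou_step(x, rng, mean=0.0, theta=0.15, sigma=0.6, dt=0.025):
    """One Euler step of an Ornstein-Uhlenbeck process."""
    drift = theta * (mean - x) * dt                        # pull back toward the mean
    diffusion = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)  # random kick
    return x + drift + diffusion

def noisy_action(action, noise, low, high):
    """Add exploration noise, then clip to the valid action range."""
    return max(low, min(high, action + noise))

# Hypothetical actuator limits, chosen only for illustration.
LOW, HIGH = -3.0, 3.0
SIGMA = 0.10 * (HIGH - LOW)   # assumption: 10% of the range used as std. dev.

rng = random.Random(0)        # fixed seed so the run is repeatable
noise = 0.0
actions = []
for _ in range(200):
    noise = ou_step(noise, rng, sigma=SIGMA)
    # 1.0 stands in for the deterministic actor output at this step.
    actions.append(noisy_action(1.0, noise, LOW, HIGH))
```

Note that the per-step random kick here is `sigma * sqrt(dt)`, so the same `sigma` produces smaller step-to-step changes at a smaller sample time.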
Thanks in advance for your time and consideration