How to compute the gradients of a SAC agent for custom training. In addition, are the target critics updated automatically by MATLAB, given that agent = rlSACAgent()?
I'm trying to train multiple SAC agents using parallel computing, but I don't know how to compute the gradients of the agents using the dlfeval function, given that I have created a minibatchqueue for data processing. In addition, since the agents were created as agent = rlSACAgent(actor1,[critic1,critic2],agentOpts), should I introduce the target critics myself, or are they handled internally by MATLAB once I specify the smoothing factor tau or the target-critic update frequency? And how can I update them?
Answers (1)
praguna manvi on 4 Sep 2024
Edited: praguna manvi on 4 Sep 2024
The actor, critic, and target critic networks are all updated internally by the “train” function for agents defined as:
agent = rlSACAgent(actor,[critic1,critic2],agentOpts);
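In this case the target critics are created automatically; you only configure how they are updated through the agent options. A minimal sketch (the property values shown are illustrative, not recommendations):
agentOpts = rlSACAgentOptions( ...
    'TargetSmoothFactor',1e-3, ...    % smoothing factor tau for the soft target update
    'TargetUpdateFrequency',1);       % how often (in learning steps) the targets are updated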
You can find an example of training an rlSACAgent in this documentation:
For custom training, you can refer to this documentation, which outlines the functions needed:
Typically, you would use the “getValue” and “getAction” functions to evaluate the critics and actor, calculate the loss, and compute the gradients with “dlgradient” inside a function evaluated by “dlfeval”. Here is a link to another example of custom training using sampled minibatch experiences:
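As a minimal sketch rather than a complete training loop, and assuming dlObs, dlAct, and dlTargets are formatted dlarray minibatches read from your minibatchqueue (with avgG, avgSqG, and iter being the Adam optimizer state), you could extract a critic's underlying dlnetwork with “getModel”, compute the gradients inside a function evaluated by “dlfeval”, and apply the update with “adamupdate”:
critics = getCritic(agent);        % for a SAC agent this returns both critics
criticNet = getModel(critics(1));  % underlying dlnetwork of the first critic
[gradC,lossC] = dlfeval(@criticLoss,criticNet,dlObs,dlAct,dlTargets);
[criticNet,avgG,avgSqG] = adamupdate(criticNet,gradC,avgG,avgSqG,iter);

function [grad,loss] = criticLoss(net,obs,act,targets)
% Forward pass: the SAC critic maps (observation, action) to a Q-value
q = forward(net,obs,act);
% Mean-squared error against precomputed Bellman targets
loss = mse(q,targets);
% Gradients of the loss with respect to the learnable parameters
grad = dlgradient(loss,net.Learnables);
end
In a custom loop the target critics are not updated for you, so you keep a copy of each critic network (targetNet below, an assumed name) and apply the soft (Polyak) update with your smoothing factor tau yourself, for example:
tau = 1e-3;   % illustrative value
targetNet.Learnables.Value = cellfun(@(t,s) (1-tau)*t + tau*s, ...
    targetNet.Learnables.Value, criticNet.Learnables.Value, ...
    'UniformOutput',false);
After updating, you can write a network back into its critic with “setModel” and the critics back into the agent with “setCritic”.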