MATLAB crashes when using Reinforcement Learning Toolbox to train an agent using Parallel Computing.

Question

MathWorks Support Team on 30 Jul 2020

Answered: MathWorks Support Team on 30 Jul 2020

Accepted Answer: MathWorks Support Team

I am using Reinforcement Learning Toolbox to train an agent with parallel computing.

With 20 cores (plus four 16 GB GPUs) training runs well, but with 32, 36, or 40 cores, MATLAB R2020a crashes.

Why is the crash happening?

Answer 1

MathWorks Support Team on 30 Jul 2020

MATLAB might crash while attempting to train a reinforcement learning agent in parallel with ten or more workers. The crash is due to a communication race condition between the client and worker processes.

You can avoid this error by updating MATLAB to R2020a Update 3.

<https://in.mathworks.com/downloads/web_downloads/download_update?release=R2020a&s_tid=ebrg_R2020a_3_2232361>
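To see whether the update is already installed, you can inspect the version string from the command line. The following is only a minimal sketch: it assumes that the string returned by version ends with "Update N" when an update is installed, and the variable names are illustrative.

```matlab
% Print the full version string, for example "9.8.0.xxxxxxx (R2020a) Update 3".
v = version;
disp(v)

% Extract the update number. Assumption: a missing "Update N" suffix
% means that no update is installed.
tok = regexp(v, 'Update (\d+)', 'tokens', 'once');
if isempty(tok)
    updateLevel = 0;
else
    updateLevel = str2double(tok{1});
end

% Warn only for R2020a installations below the update recommended above.
if strcmp(version('-release'), '2020a') && updateLevel < 3
    warning('R2020a Update %d detected; Update 3 is recommended.', updateLevel);
end
```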

As a workaround, to bypass the communication race condition for PG, DQN, DDPG, TD3, and PPO agents, use synchronous parallel training and configure the workers to wait until the end of the episode before sending data to the host. To do so, configure your rlTrainingOptions object as shown in the following code:

trainOptions = rlTrainingOptions;
trainOptions.UseParallel = true;                                % train on parallel workers
trainOptions.ParallelizationOptions.Mode = "sync";              % synchronous parallel training
trainOptions.ParallelizationOptions.StepsUntilDataIsSent = -1;  % send data only at the end of each episode

Using StepsUntilDataIsSent = -1 is not supported for AC agents. To avoid a communication race condition for these agents, consider using a PPO agent with experience-based parallel training or a PG agent with gradient-based parallel training.
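As an illustration of the PPO alternative, the sketch below sets up experience-based synchronous training and passes the options to train. It assumes that a PPO agent and an environment already exist in the workspace under the hypothetical names agent and env. For a PG agent, the corresponding setting would be DataToSendFromWorkers = "gradients".

```matlab
% Assumption: "agent" is an existing PPO agent and "env" an existing
% environment; both names are placeholders for your own objects.
trainOptions = rlTrainingOptions;
trainOptions.UseParallel = true;

% Synchronous training: workers pause until the host has processed their data.
trainOptions.ParallelizationOptions.Mode = "sync";

% Experience-based parallel training: workers send experiences to the host.
trainOptions.ParallelizationOptions.DataToSendFromWorkers = "experiences";

% Workers wait until the end of the episode before sending data.
trainOptions.ParallelizationOptions.StepsUntilDataIsSent = -1;

% Start training with the configured options.
trainingStats = train(agent, env, trainOptions);
```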
