MATLAB cluster on AWS-EC2 - validation fails

Kiran R on 10 Aug 2022
Commented: Kiran R on 13 Aug 2022
I was trying to set up a cluster on Amazon Web Services (AWS) using MATLAB Parallel Server. The AWS side is done, but when I configure the cluster in MATLAB on my local machine, the validation fails. The error from the validation report is shown below. I used a p3.2xlarge instance on AWS with 8 vCPUs and a single 16 GB NVIDIA V100 GPU.
" Error Report: Job errored or did not reach the state 'finished'.
Job was cancelled because the cluster does not have enough workers to meet the minimum
of the job's NumWorkersRange property: 1."
1 Comment
Alison Eele on 12 Aug 2022
What the validation is trying to tell you is that, whilst the job scheduler is available and accessible in the cloud, there don't seem to be any workers started. You can confirm how many workers the job scheduler thinks it has by running:
c = parcluster("kr p3 2xlarge");
c.NumWorkers
If you're using the reference architecture I linked earlier, then as part of deployment you needed to state the number of worker instances and the number of workers per instance. Is it possible you turned the auto scaling group down to 0?
If you are deploying this entirely manually, did you run the 'startworker' command as part of your MJS setup?
Our Installation Support team are likely best placed to help you fix this: https://www.mathworks.com/support/contact_us.html
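As a slightly fuller version of the check above, here is a minimal sketch that also reports idle and busy workers. It assumes the profile points at a MATLAB Job Scheduler (MJS) cluster, where the cluster object exposes NumIdleWorkers and NumBusyWorkers, and it reuses the profile name from this thread; substitute your own.

```matlab
% Load the cluster profile (name taken from this thread; replace with yours).
c = parcluster("kr p3 2xlarge");

% Workers the job scheduler currently knows about, split by state.
% NumIdleWorkers and NumBusyWorkers are assumed to be available, which
% holds for MJS cluster objects.
fprintf("Total workers: %d\n", c.NumWorkers);
fprintf("Idle workers:  %d\n", c.NumIdleWorkers);
fprintf("Busy workers:  %d\n", c.NumBusyWorkers);

% Validation jobs need at least one worker (NumWorkersRange minimum of 1).
if c.NumWorkers == 0
    disp("No workers are registered with the job scheduler.");
end
```

If the total is 0, the scheduler is reachable but no worker processes have registered with it, which matches the validation error quoted in the question.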


Accepted Answer

Kiran R on 12 Aug 2022
Thank you, Alison Eele.
I understand that, by default, AWS sets the limit for P3 instances to zero. I have submitted a request to increase this to 8. Let me see if it works once the limit is increased.
When I tried a C5 instance, the validation was successful.
Thanks once again for pointing this out.
2 Comments
Alison Eele on 12 Aug 2022
Ah, I see. Your worker instances weren't provisioned because of the default usage limits AWS applies. Glad you're on track to figuring out a solution.
Kiran R on 13 Aug 2022
Yes Eele,
It's now validated successfully. Thanks for pointing this out.



Release

R2021b
