Parpool errors on SLURM computing system

Caleb_Holt on 14 Aug 2018
I'm running a script that relies on a parfor loop. To initialize my parpool I use the command
parpool(20) % where 20 is the number of cores I have requested.
Occasionally I get the error
Error using parpool (line 113)
Failed to start a parallel pool. (For information in addition to the causing
error, validate the profile 'local' in the Cluster Profile Manager.)
Caused by:
Error using parallel.internal.pool.InteractiveClient>iThrowWithCause (line 675)
Failed to start pool.
Error using parallel.Job/submit (line 351)
Error closing file
/gpfs/home/cholt/.matlab/local_cluster_jobs/R2017b/Job47.in.mat.
The file may be corrupt.
The code sometimes runs fine and sometimes fails with this error. I don't understand why the error appears only intermittently. How can I fix it? Thanks for your help. -C
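For reference, a minimal sketch of how the pool start-up above could follow the SLURM allocation rather than a hard-coded 20. It assumes the job was submitted with `--cpus-per-task`, so that SLURM sets the `SLURM_CPUS_PER_TASK` environment variable; the variable name `nRequested` and the fallback value are illustrative only. This does not by itself address the intermittent error.

```matlab
% Read the core count from the SLURM allocation (assumption: the job was
% submitted with --cpus-per-task, so SLURM_CPUS_PER_TASK is set).
nRequested = str2double(getenv('SLURM_CPUS_PER_TASK'));  % NaN if the variable is unset
if isnan(nRequested)
    nRequested = 20;  % fall back to the value used in the question
end

% Make sure the 'local' profile allows at least that many workers.
c = parcluster('local');
c.NumWorkers = max(c.NumWorkers, nRequested);

% Start a pool only if none is open, since parpool errors when one already exists.
if isempty(gcp('nocreate'))
    parpool(c, nRequested);
end
```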

Answers (2)

Zenin Easa Panthakkalakath on 17 Aug 2018
Hey Caleb,
It is possible that one or more of the workers never fully started. Certain preference settings can cause this, so try regenerating the MATLAB preferences.
Also, please check whether your "startup.m" file contains any commands that alter the MATLAB preferences. If it does, comment them out and run the program again.
Regards
Zenin
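A quick way to carry out the checks suggested above, as a sketch only. It uses the standard `which` and `prefdir` functions; the variable name `hasStartup` is illustrative. Regenerating preferences is commonly done by renaming the folder reported by `prefdir` while MATLAB is closed and then restarting, but treat that as an assumption about the intended procedure rather than the exact steps meant here.

```matlab
% List every startup.m visible on the MATLAB path.
% If there is none, MATLAB prints: 'startup' not found.
which startup -all

% The same check in a form that a script can branch on.
hasStartup = ~isempty(which('startup'));
fprintf('startup.m found on path: %d\n', hasStartup);

% Folder that holds the preference files for this release.
disp(prefdir);
```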
  1 comment
Caleb_Holt on 17 Aug 2018
Hey Zenin
I haven't changed the preference settings at all; that file is still empty. I also don't have a startup.m file. I think the problem is that MATLAB is trying to access a job file that doesn't exist. I'm not sure whether I should create those files myself or what needs to happen.



Zenin Easa Panthakkalakath on 20 Aug 2018
Hey Caleb,
I understand that you haven't made any changes to the preference settings. In that case, the cluster profile may not have been validated. Please refer to the documentation on validating cluster profiles and validate the 'local' profile.
Please let me know if this works for you.
Regards
Zenin
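Separately from the Cluster Profile Manager, a small independent job can serve as a rough smoke test of the 'local' profile. This is a sketch, not the validation procedure referred to above: it only checks that a job can be created, written to the job storage location, run, and read back. The task (`sum` of `[1 2 3]`) is arbitrary.

```matlab
% Submit one trivial task through the 'local' profile and read its result.
c = parcluster('local');
j = createJob(c);
createTask(j, @sum, 1, {[1 2 3]});   % one output argument, input [1 2 3]
submit(j);                           % writes the job files, as parpool does
wait(j);                             % block until the job finishes

out = fetchOutputs(j);               % cell array with one result per task
disp(out{1});                        % displays 6 when the job ran correctly

delete(j);                           % remove the job files again
```

If `submit` fails here with the same "Error closing file" message, the problem lies in writing to the job storage location rather than in the parfor code.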
  3 comments
Caleb_Holt on 20 Aug 2018
In particular, look at these lines of the error message:
Error using parallel.Job/submit (line 351)
Error closing file
/gpfs/home/cholt/.matlab/local_cluster_jobs/R2017b/Job47.in.mat.
The file may be corrupt.
I think I've reached the space limit on my home directory, where the .matlab directory is stored, and that this is what throws the error. How can I change where these job .mat files are stored?
Zenin Easa Panthakkalakath on 21 Aug 2018
Hey Caleb,
The directory that you've mentioned is the preference directory.
>> prefdir % returns the preference directory
To set a custom preference directory, please try the solution in this article.
Regards
Zenin
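The error path shows the job files being written under ~/.matlab in the home directory. As an alternative sketch, the cluster object's `JobStorageLocation` property can be pointed at another directory for the current session. The scratch root below is hypothetical and should be replaced with a location that has free quota. Using `SLURM_JOB_ID` to give each SLURM job its own folder is an assumption that concurrent jobs sharing one storage folder could also interfere with each other; the thread does not confirm that this is the cause.

```matlab
% Hypothetical scratch root; replace with a directory that has free space.
scratchRoot = fullfile(tempdir, 'matlab_jobs');

% One folder per SLURM job (SLURM_JOB_ID is set by SLURM inside a job).
jobId = getenv('SLURM_JOB_ID');
if isempty(jobId)
    jobId = 'interactive';           % running outside SLURM
end
storageDir = fullfile(scratchRoot, jobId);
if ~exist(storageDir, 'dir')
    mkdir(storageDir);
end

% Redirect job storage for this cluster object, then start the pool from it.
c = parcluster('local');
c.JobStorageLocation = storageDir;
pool = parpool(c, 20);               % 20 workers, as in the question
```

The change applies only to the cluster object `c` in this session, so the pool has to be started from `c` as shown rather than with a bare `parpool(20)`.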


Version

R2017b
