Reported Task State not accurate when running on MS HPC grid

emarch on 19 Aug 2019
Hi,
We are using MATLAB R2018a with the Parallel Computing Toolbox in conjunction with MATLAB Parallel Server, with MS HPC Server 2012 as the scheduler. We've noticed that when we retrieve task states using the following construct, incorrect states are commonly returned:
obj.Job.Tasks.State
For example, when we first start a job, a task will report pending, then briefly switch to failed, before accurately reporting running. Are there any tricks for getting these task states reported properly?
Thanks for any help.

Accepted Answer

Edric Ellis on 20 Aug 2019
Unfortunately, getting accurate state information back from the cluster can be tricky, because there are multiple sources of state information. First, there are the "JobX/TaskY.state.mat" files on disk in your JobStorageLocation. These are created in state pending, the client moves them to queued on submission, and then the worker MATLAB processes set them to running and finally finished. Second, there is the information that comes back from querying the underlying scheduling system. These two sources can occasionally (and usually transiently) conflict with each other, which leads to spurious states being observed. (Querying the underlying scheduling system is necessary to handle the case where the worker MATLAB crashes before it can set the state file to running or finished.)
If you can, I would recommend using Job.wait (perhaps with the timeout parameter) as your primary means of waiting for results to become available. This method ought to be more reliable than querying the task State properties directly, as it performs more detailed (and more expensive) checks.
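As a rough illustration of the Job.wait approach, the following sketch blocks until the job reports finished or a timeout elapses, and only then looks at the individual tasks. It assumes that obj.Job from the question holds a parallel.Job handle; the 600-second timeout is an arbitrary placeholder, not a recommended value.

```matlab
% Assumption: obj.Job (from the question) is a parallel.Job handle.
job = obj.Job;

% Hypothetical timeout in seconds; choose a value that suits your workload.
timeoutSeconds = 600;

% wait returns true if the job reached the requested state,
% and false if the timeout elapsed first.
reachedFinished = wait(job, 'finished', timeoutSeconds);

if reachedFinished
    % Inspect the tasks only once the job as a whole has settled.
    tasks = job.Tasks;
    for k = 1:numel(tasks)
        if ~isempty(tasks(k).Error)
            fprintf('Task %d reported an error: %s\n', ...
                tasks(k).ID, tasks(k).Error.message);
        end
    end
else
    fprintf('Job did not finish within %d s (current job state: %s).\n', ...
        timeoutSeconds, job.State);
end
```

Calling wait in a loop with a short timeout is one way to keep a client responsive while still avoiding direct polling of the task State properties.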

Version

R2018a