Parpool time to launch excessive

jgg on 28 Feb 2016
Answered: Seth on 9 Apr 2018
Hello everyone
I'm working with a large server which is running a large computation. I've been trying to speed this up using the Parallel Computing Toolbox, which means we need to launch and connect to a parallel pool. We've tried several ways of calling this. Right now, it looks like this:
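parpool_size = 32;                   % pool size currently in use
cluster = parcluster('local');       % cluster object for the local profile
cluster.NumWorkers = parpool_size;   % raise the worker limit on this object
parpool(cluster, parpool_size);      % pass the object so the new limit applies
p = gcp('nocreate');                 % If no pool, do not create new one.
if isempty(p)
    a = 0
else
    a = p.NumWorkers
end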
The key line is the parpool command. The size is 32 right now, but we're hoping to scale it up to 200 or so if this works well.
Unfortunately, we cannot seem to get the parpool command to complete. It does not throw an error or crash, but it takes in excess of 4 hours to execute (at which point the job times out). Does anyone have any idea why this might be the case, or any suggestions for improving the execution time?
If it is relevant: to improve speed, we are running this under the 2014b MATLAB compiler on a Linux-based system (v83), but we see identical problems with 2015b as well (v90).

Accepted Answer

jgg on 30 Mar 2016
I'm going to answer this question for myself, approximately.
It seems the issue was not specific to the MATLAB2014b compiler; we replicated the issue with 2015b and 2013b (although the number of processors was much smaller).
It seems the issue had to do with the data loading. Basically, our workflow was like this:
setUpParameters();
loadData();
createParallelPool();
doEstimation();
In this case, the loadData() stage was very large, loading several gigabytes of data into memory, including a set of anonymous functions. This was fine hardware-wise, but it seems to have made the subsequent parallel pool creation very slow. We believe (but are not sure) that the creation was so slow because the memory was being replicated several hundred times. Precisely why this was slow, I am not sure.
However, we were able to resolve the problem by moving the createParallelPool() command to the beginning of our function, so the revised workflow looked like:
createParallelPool();
setUpParameters();
loadData();
doEstimation();
Because no jobs were assigned to the pool before the doEstimation() stage, the revised workflow did not encounter the same problem we had earlier. Essentially, we dramatically reduced the amount of memory that had to be transferred to the other instances of MATLAB.
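The reordering described above can be sketched as follows. This is a minimal illustration, not our original code: the stage functions are replaced by hypothetical stand-ins (a small random matrix instead of loadData(), column means instead of doEstimation()), and the pool size of 2 is an arbitrary assumption. It requires the Parallel Computing Toolbox.

```matlab
% Sketch only: start the pool while the client workspace is still small.
poolSize = 2;                          % assumed size, for illustration only
pool = gcp('nocreate');                % reuse an existing pool if there is one
if isempty(pool)
    pool = parpool('local', poolSize); % createParallelPool() stage
end

nCols = 8;                             % stand-in for setUpParameters()
data = rand(1000, nCols);              % stand-in for loadData()

% Stand-in for doEstimation(). data is indexed by the loop variable,
% so it is sliced and each worker receives only the columns it needs.
colMeans = zeros(1, nCols);
parfor k = 1:nCols
    colMeans(k) = mean(data(:, k));
end
```

The only point of the sketch is the order of operations: the pool exists before any large variables are created on the client.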
As an aside, we ended up using MATLAB2015b because the 2014b version has great difficulty with anonymous functions in a parallel environment; it would quickly use far too much memory, then page, crashing the program.
Lessons learned:
  • Use as little data in RAM as possible when doing parallel computations, and especially avoid complex data-types. The matfile command is very useful.
  • Declare parallel pools as early as possible in your program to reduce overhead.
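As a rough illustration of the matfile point in the first lesson, the sketch below reads part of a variable from disk without loading the whole file. The file name, variable name, and sizes are hypothetical; partial loading relies on the file being saved in the -v7.3 format.

```matlab
% Sketch only: read part of a large variable without loading the whole file.
fname = [tempname '.mat'];             % hypothetical temporary file
X = rand(5000, 20);                    % stand-in for a large data set
save(fname, 'X', '-v7.3');             % v7.3 format supports partial loading
clear X

m = matfile(fname);                    % handle to the file; X is not loaded
[nRows, nCols] = size(m, 'X');         % dimensions, without loading X
block = m.X(1:100, :);                 % loads only the first 100 rows

clear m
delete(fname);                         % remove the temporary file
```

Only the requested block ends up in the workspace, which keeps the client's memory footprint small.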

More Answers (2)

Jon Russo on 25 Oct 2017
I am running MATLAB R2017a, and when I try to start a parallel pool of size 4 it takes about 30 minutes. Any ideas what might be going on?

Seth on 9 Apr 2018
In our experience, very slow parallel pool start times have mostly been related to the MATLAB license checks performed when the parpool command is called. Specifically, changing whether MATLAB checks network license servers, local license files, or both (redundantly) helped reduce parpool() start times; however, startup is still much slower than it should be.
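To compare start times before and after a change like this, it can help to time the pool launch on its own in a fresh MATLAB session. The snippet below is only a measurement sketch; the pool size of 2 is an arbitrary assumption, and it does not by itself show whether licensing is the cause.

```matlab
% Sketch only: time pool startup by itself, ideally in a fresh session.
delete(gcp('nocreate'));               % shut down any existing pool
t0 = tic;
pool = parpool('local', 2);            % small pool; size is an assumption
startupSeconds = toc(t0);
fprintf('Pool with %d workers started in %.1f s\n', ...
        pool.NumWorkers, startupSeconds);
delete(pool);                          % shut the pool down again
```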
