How can I make more efficient use of parallel workers?
My code performs 30 iterations of a task via parfor, with 6 workers in the parallel pool. Each iteration writes files when it starts and again when it finishes, so I can see at each moment which iterations are currently underway and which are already finished. For example, when the parfor loop first starts, it creates 6 starter files as expected.
What bothers me is that sometimes, after an hour or so, I can see that only (say) 2 workers are busy and yet (say) 3 iterations haven’t started at all. I’m wondering how to change my code so that some of the workers that are idle at that point would go ahead and start the unstarted iterations.
I suppose this happens because some of the iterations take much longer than others (e.g., 2 minutes versus 2 hours). From my understanding of the parforOptions RangePartitionMethod parameter, it appears that parfor allocates iterations to workers in advance. Thus, if a particular worker happens to get assigned all of the slow iterations, that worker could potentially still be on its first of 5 iterations even after the other workers had finished all 5 of theirs. In that case most of the workers might be idle most of the time, which is obviously inefficient.
If I could predict in advance which iterations would be the slow ones, it looks like I could improve efficiency by distributing the slow tasks evenly across workers using RangePartitionMethod. Unfortunately, I can’t predict which iterations will be slow. Presumably MATLAB provides another, better way to arrange the parallelism in situations like this. Hence, my question in the title of this post.
Thanks for any tips.
Edric Ellis on 4 Feb 2022
A couple of suggestions: you could consider using DataQueue rather than writing files to track progress on the workers. Yes, it sounds like your extreme mismatch in loop iteration durations is not working well with the default parfor loop division of work. You could indeed use parforOptions to split things up into smaller pieces, like this:
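q = parallel.pool.DataQueue();
% Print each message on the client as it arrives from a worker.
afterEach(q, @(msg) fprintf('Got message: %s\n', msg));
% Send iterations to the workers one at a time rather than in larger subranges.
opts = parforOptions(gcp(), 'RangePartitionMethod', 'fixed', 'SubrangeSize', 1);
parfor (i = 1:10, opts)
    send(q, sprintf('Starting iteration: %d', i));
    % ... the work for iteration i goes here ...
    send(q, sprintf('Finished iteration: %d', i));
end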
By selecting a 'SubrangeSize' of 1, each loop iteration will be sent separately to the workers. Usually this is not a good idea, since it increases the execution overhead, but it might work better in your case.
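To see why single-iteration subranges can help when durations are this uneven, here is a toy model in plain MATLAB (no pool needed). It is only a sketch under stated assumptions: the durations are made-up values, the three slow iterations are deliberately placed next to each other, and "pre-assigned contiguous blocks of 5" is a simplification of the situation described in the question, not a description of how parfor's default partitioning actually works. It also ignores the per-iteration overhead mentioned above.

```matlab
% Toy model: compare pre-assigned blocks with one-at-a-time dispatch.
% All durations are hypothetical illustration values, not measurements.
numWorkers = 6;
durations = 2 * ones(1, 30);      % 30 iterations, 2 minutes each ...
durations([1 2 3]) = 120;         % ... except three slow ones (2 hours each)

% Strategy A (assumed): each worker gets a contiguous block of 5 iterations up front.
blocks = reshape(durations, [], numWorkers);   % 5-by-6, one column per worker
staticFinish = max(sum(blocks, 1));            % the slowest worker sets the wall time

% Strategy B: iterations are dispatched singly; the next one always goes
% to whichever worker becomes free first.
busyUntil = zeros(1, numWorkers);
for k = 1:numel(durations)
    [~, w] = min(busyUntil);                   % earliest-free worker
    busyUntil(w) = busyUntil(w) + durations(k);
end
dynamicFinish = max(busyUntil);

fprintf('Pre-assigned blocks: %d min\n', staticFinish);
fprintf('One-at-a-time:       %d min\n', dynamicFinish);
```

With these numbers the pre-assigned case takes 364 minutes, because one worker ends up with all three slow iterations, while one-at-a-time dispatch takes 120 minutes, which is simply the length of the longest single iteration. Real timings will differ, but the model shows why unpredictable slow iterations favour small subranges.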