MATLAB Answers

Debugging parfor

Asked by Joan Puig on 15 Jun 2011
Latest activity Answered by Dimitrij Chudinzow on 29 Jun 2017
Hi,
We have been working on parallelizing our code, and we have found that when an error occurs inside a parfor loop it is hard to debug. What methods do you use to figure out what is going on (especially when the serial version works perfectly)?
More specifically, when I run the code below, I would expect to get a My:Error with the stack trace pointing to the line that raised it. Instead, we get an error pointing to an internal function, and the stack trace shows the "parfor" line as the source of the problem:
clear();
clc();
r = [];
try
    parfor i = 1:10
        r(i) = rand(1,1);
        if r(i)<0.9
            error('My:Error','Try again');
        end
    end
catch le
    le
    for j = 1:numel(le.stack)
        le.stack(j)
    end
    rethrow(le);
end
Output:
le =
  MException
  Properties:
    identifier: 'My:Error'
       message: 'Try again'
         cause: {0x1 cell}
         stack: [2x1 struct]
  Methods
ans =
    file: 'C:\Program Files\MATLAB\R2011a\toolbox\matlab\lang\parallel_function.m'
    name: 'parallel_function'
    line: 475
ans =
    file: 'D:\SynapticPoint\SourceTrunk\Matlab\ScratchPad\scr_error_in_parfor.m'
    name: 'scr_error_in_parfor'
    line: 7
??? Error using ==> parallel_function at 475
Try again
Error in ==> scr_error_in_parfor at 7
parfor i = 1:10
>>


3 Answers

Answer by Edric Ellis on 16 Jun 2011

Firstly, there should be few differences between running your code containing PARFOR with MATLABPOOL closed and with MATLABPOOL open, except that with the pool closed the loop runs on the client, so you can set breakpoints inside functions called from the loop body. I.e. if you have code like:
parfor ii=1:10
    x(ii) = myFcn(ii);
end
You can set breakpoints inside myFcn().
Secondly, if you put your code inside a function rather than a script, you should get better diagnostics. I simplified your code a little:
function pfeg
try
    parfor i = 1:10
        if rand < 0.9
            error('My:Error','Try again');
        end
    end
catch le
    getReport( le )
end
and this now gets the output:
Error using ==> parallel_function at 598
Error in ==> pfeg>(parfor body) at 5
Try again
Error in ==> pfeg at 3
parfor i = 1:10
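As a complementary sketch (not part of the answer above), one option is to catch the error inside the loop body and store the report per iteration, so the failing line is recorded even when the top-level stack is unhelpful. The variable names and the artificial failure on every third iteration are assumptions for illustration only.

```matlab
n = 10;
reports = cell(1, n);              % sliced output: one slot per iteration
parfor i = 1:n
    try
        % Stand-in for the real work; fails on every third iteration.
        if mod(i, 3) == 0
            error('My:Error', 'Try again');
        end
    catch err
        % Keep the full text report, including the stack where available.
        reports{i} = getReport(err, 'extended', 'hyperlinks', 'off');
    end
end
failed = find(~cellfun(@isempty, reports));   % iterations that raised an error
```

After the loop, reports{failed(1)} holds the report for the first failing iteration. The trade-off is that the loop no longer stops at the first error, so this is probably best kept as a temporary diagnostic.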



Answer by Joan Puig on 16 Jun 2011

It's true that the computational parts of the code generate the same errors with or without the matlabpool open, which is a good thing.
On the other hand, we have some "configuration" problems where for example:
- The Java class path is not set correctly on the workers.
- The state of the data cache might be different on the different workers.
- Database connections on the workers might be in a different state.
- Datafeed connections on the workers might be in a different state.
All these situations are very hard to debug if we can't even find out what line of code is causing the problem.

  1 Comment

Edric Ellis
on 17 Jun 2011
Hi Joan, do you *not* get the line of code in the error stack when the parfor loop is inside a function body?
And yes, we only ensure that the MATLAB path is synchronised between client and workers; you must deal with any other setup that's required yourself.
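For the kind of worker setup mentioned here, a minimal sketch of one way to inspect per-worker state is an spmd block that reports back to the client. It assumes a pool is already open and uses the dynamic Java class path as the example; the variable names are hypothetical, and the same pattern could be adapted to cache or connection checks.

```matlab
% Assumes a pool is already open (e.g. via matlabpool open).
spmd
    cp = javaclasspath;            % dynamic Java class path on this worker
    nEntries = numel(cp);
end
% nEntries is a Composite on the client; index it to read each worker's value.
for w = 1:numel(nEntries)
    fprintf('worker %d: %d dynamic class path entries\n', w, nEntries{w});
end
```

If the counts differ between workers, or from numel(javaclasspath) on the client, the class path setup is a likely suspect.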



Answer by Dimitrij Chudinzow on 29 Jun 2017

My approach is to replace "parfor" with "for". This way you will find the line that causes trouble, but unfortunately it will take more time, since parallel computing is disabled for that particular loop.
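If editing the loop keyword back and forth is inconvenient, one possible sketch is the parfor (..., M) form, where M is the maximum number of workers; with M set to 0 the loop body runs on the client, much like an ordinary for loop. The flag name debugSerial and the limit of 4 workers are assumptions for illustration.

```matlab
debugSerial = true;                % hypothetical switch: true while debugging
if debugSerial
    maxWorkers = 0;                % 0 workers: loop body runs on the client
else
    maxWorkers = 4;                % assumed upper limit for normal runs
end

r = zeros(1, 10);
parfor (i = 1:10, maxWorkers)
    r(i) = i^2;                    % stand-in for the real loop body
end
```

Flipping debugSerial back to false restores parallel execution without touching the loop itself.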
