Working around segmentation violations in parfor loops
Afficher commentaires plus anciens
I am currently working on a project where I have to run multiple repetitions of a time-consuming MATLAB function in parallel. For the purposes of this question, let's refer to the function as myfunc.
myfunc uses a MEX file and ends up with a random segmentation violation once every 3 hours. I cannot diagnose the segmentation fault since it originates from a propriety API that I did not code myself. However, I do know that it occurs within the MEX file, and I also know that it is not deterministically related to any settings I can change.
I would like to work around the segmentation violation, and I would ideally also like to keep on using the parfor function in MATLAB. My idea right now is to use a try catch loop within the parfor loop as follows:
%create an output cell to store nreps of output from 'myfunc'
output = cell(1,nreps)
%create a vector to track # of runs that finish successfully
successfulrun = zeros(1,nreps);
% run myfunc in parallel
parfor i = 1:nreps
try
output{i}
successfulrun(i) = true
end
end
%rerun experiments that did not end up successfully
while sum(successulruns) < nreps
%count # of experiments to rerun
%initialize variables to store new results
reps_to_rerun = find(successfulruns == 0);
nreps_to_rerun = sum(reps_to_rerun);
newoutput = cell(1,nreps_to_rerun);
newsuccessfulrun = zeros(1,nreps_to_rerun)
%rerun experiments
parfor i = 1:nreps_to_rerun
try
newoutput{i};
newsuccessfulrun = true;
end
end
%transfer contents to larger loop
for i = 1:nreps_to_rerun
rerun_index = reps_to_rerun(i);
successfulrun(rerun_index) = newsuccessfulrun(i)
if newsuccessfulrun(i)
output{i} = newoutput{i};
end
end
end
My questions are:
1. Will it be OK to keep continuing to run more repetitions like this even though there was a segmentation violation within the MEX file? Or should I clear the memory / restart the matlabpool? I'm assuming this shouldn't be problem since the segmentation violation was in C.
2. Is there any way to "break" out of a parfor loop?
5 commentaires
Kaustubha Govind
le 13 Avr 2012
I don't think try-catch will help w.r.t. SegV's, because they usually end the process that is being run. In other words, MATLAB will be shut down due to the SegV.
Ken Atwell
le 14 Avr 2012
Kaustubha is correct -- try/catch will trap a MATLAB error, but not a segfault triggered by the OS.
Berk Ustun
le 23 Avr 2012
Sean de Wolski
le 23 Avr 2012
Does it occur with a regular for-loop?
Berk Ustun
le 23 Avr 2012
Réponses (0)
Catégories
En savoir plus sur Parallel Computing Toolbox dans Centre d'aide et File Exchange
Produits
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!