Trouble with counter variables and file writing in a parallelized for loop

Dominik Hiltbrunner on 18 Nov 2021
Dear community
I am currently trying to speed up the computation of a nested for-loop. To do so, I want to use the Parallel Computing Toolbox to distribute the outermost loop across several workers. My program has the following structure:
clc; clear; close all; format shorteng;
A = [20 50 100 200];
B = [1 2 3 4];
C = [-8 -6 -4 -2];
D = [-1 -2 5 8];
E = 1e-6*[1 2 3 4];
X = [3 9 -21 4];
Y = 1e-3*[3 3 4 4];
cnt1 = 0;
cnt2 = [0 0];
N_itr = length(A)*length(B)*length(C)*length(D)*length(E);
res = [];
foo = [];
progress_bar = waitbar(0, 'Starting');
result_file = fopen('dummy.txt','w');
for a=A
    for b=B
        for c=C
            for d=D
                for e=E
                    cnt1 = cnt1 + 1;
                    res = [res, b+c*e*max(abs(eig(rand(a))))];
                    is_valid = true;
                    for x=X
                        for y=Y
                            if(a*c>0)
                                cnt2(1) = cnt2(1) + 1;
                                is_valid = false;
                            end
                            if(b*y<0)
                                cnt2(2) = cnt2(2) + 1;
                                is_valid = false;
                            end
                            if(c*x>0)
                                foo = [foo, a-b+c-d+e-x+y*max(abs(eig(rand(10))))];
                            end
                        end
                    end
                    if(is_valid==true)
                        fprintf(result_file,'%e %e %e %e %e\n',a,b,c,d,e);
                    end
                    waitbar(cnt1/N_itr, progress_bar, sprintf('Progress: %d %%', floor(cnt1/N_itr*100)));
                end
            end
        end
    end
end
fclose(result_file);
close(progress_bar);
The code above does not do anything useful and is only for demonstrative purposes. My code fulfills the following properties:
  • It does not rely on the execution order
  • It does not rely on the indices of the iteration variables
  • When incrementing counters, all indices are hard-coded. None of them are assigned depending on any variable.
  • No variables outside the for-loop are overwritten except for the counters
  • Some result variables are extended dynamically, but they are never read or overwritten within the loop
Below is a working implementation of the program above that uses the parfor of the Parallel Computing Toolbox. Note, however, that some parts of the code have been commented out because they cause trouble.
clc; clear; close all; format shorteng;
A = [20 50 100 200];
B = [1 2 3 4];
C = [-8 -6 -4 -2];
D = [-1 -2 5 8];
E = 1e-6*[1 2 3 4];
X = [3 9 -21 4];
Y = 1e-3*[3 3 4 4];
cnt1 = 0;
cnt2 = [0 0];
N_itr = length(A)*length(B)*length(C)*length(D)*length(E);
res = [];
foo = [];
%progress_bar = waitbar(0, 'Starting');
result_file = fopen('dummy.txt','w');
parfor k=1:length(A) % Iterate over all indices of A, not only the last one
    a = A(k); % Must be converted this way
    for b=B
        for c=C
            for d=D
                for e=E
                    cnt1 = cnt1 + 1;
                    res = [res, b+c*e*max(abs(eig(rand(a))))];
                    is_valid = true;
                    for x=X
                        for y=Y
                            if(a*c>0)
                                % Problem 2: Cannot use cnt2 here
                                %cnt2(1) = cnt2(1) + 1;
                                is_valid = false;
                            end
                            if(b*y<0)
                                % Problem 2: Cannot use cnt2 here
                                %cnt2(2) = cnt2(2) + 1;
                                is_valid = false;
                            end
                            if(c*x>0)
                                foo = [foo, a-b+c-d+e-x+y*max(abs(eig(rand(10))))];
                            end
                        end
                    end
                    if(is_valid==true)
                        % Problem 3: Cannot write to file
                        %fprintf(result_file,'%e %e %e %e %e\n',a,b,c,d,e);
                    end
                    % Problem 1: Cannot use "cnt1" here
                    %waitbar(cnt1/N_itr, progress_bar, sprintf('Progress: %d %%', floor(cnt1/N_itr*100)));
                end
            end
        end
    end
end
fclose(result_file);
%close(progress_bar);
Problem 1
I want to keep track of the progress with the help of a waitbar. To do so, I divide the counter cnt1 by the total number of iterations N_itr. Of course, this does not work inside the parfor-loop. Is there another way to keep track of the progress?
Problem 2
For some reason I cannot increment the counter cnt2 although this variable does not rely on execution order or iteration indices. I get the following error:
Error: Unable to classify the variable 'cnt2' in the body of the parfor-loop. For more information, see Parallel for Loops in MATLAB, "Solve Variable Classification Issues in parfor-Loops".
According to the documentation, cnt2 is just a reduction variable and this should be fine. What is the problem here?
Problem 3
I want to write some solutions to a text file. As expected, this does not work, i.e. the different workers cannot all access the same file. Interestingly, the error I get is:
Invalid file identifier. Use fopen to generate a valid file identifier.
But commenting out the fprintf() line fixes the problem. Is there an easy way to write results to files? Can I, for example, open an individual file for each worker? If so, how can I do that, and how do I assign the files to the workers?
Thank you!
Jan on 18 Nov 2021
Just a hint:
for x=X
    for y=Y
        if(a*c>0)
            % Problem 2: Cannot use cnt2 here
            %cnt2(1) = cnt2(1) + 1;
            is_valid = false;
        end
If some code does not depend on the loop counters, move it outside the loops to save processing time.
Have you seen the many waitbar implementations for parfor-loops on the File Exchange? https://www.mathworks.com/matlabcentral/fileexchange?q=parfor+waitbar
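A minimal sketch of the hint above, reusing the question's variable names with shortened, illustrative value lists. The test a*c>0 involves neither x nor y, so it can be evaluated once per (a,c) pair, and its contribution to a counter can be added in one step. The names cnt_inner, cnt_hoisted and n_inner are made up for this illustration.

```matlab
% Sketch: hoist a loop-invariant test out of the inner x/y loops.
A = [20 50]; C = [-8 4];                 % illustrative values, not the question's
X = [3 9 -21 4]; Y = 1e-3*[3 3 4 4];

% Original form: the same test is re-evaluated for every (x,y) pair.
cnt_inner = 0;
for a = A
    for c = C
        for x = X
            for y = Y
                if a*c > 0
                    cnt_inner = cnt_inner + 1;
                end
            end
        end
    end
end

% Hoisted form: evaluate once per (a,c) and account for all inner iterations.
cnt_hoisted = 0;
n_inner = numel(X)*numel(Y);
for a = A
    for c = C
        if a*c > 0
            cnt_hoisted = cnt_hoisted + n_inner;
        end
    end
end

fprintf('inner: %d, hoisted: %d\n', cnt_inner, cnt_hoisted);
```

Both loops produce the same count; the hoisted form simply performs the comparison far less often.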


Answers (2)

Raymond Norris on 18 Nov 2021
For Problem 1, use a parallel.pool.DataQueue.
In fact, one of the examples shows using a waitbar.
For Problem 2, as Jan points out, it is not clear whether you need or use cnt2. I'll give this some more thought.
For Problem 3, the file identifier is valid on the client, but not on the worker (it needs to be opened on the worker). I agree that commenting out the call to fprintf "fixes the problem", but then you still aren't writing to the file, which is what you want to do. Two ways come to mind to solve this problem.
  • Bring all the data back and write the file from the client. You'll most likely use a sliced output variable (versus a reduction).
  • Open unique files from the workers. You'll then need to merge the files back on the client. This may be tricky if the order in the file matters (which it shouldn't, since it's in a parfor-loop) and, if so, whether there is an easy way to restore that order. Additionally, you'll need to find a way to bring the results back over (I'm not sure whether you're running the workers on your local machine or on a remote cluster). To get unique file names, use some combination of the loop variable (k), a time stamp, and the process ID (e.g., feature getpid). The advantage of the latter two is that they will be unique for subsequent runs.
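The first option could look roughly like the sketch below. It is only an illustration under stated assumptions: the value lists are shortened, the validity test a + b*c > 40 is a hypothetical placeholder, and the names rows, local_rows and all_rows are made up here. Each parfor iteration stores its rows in one slot of a cell array (a sliced output variable), and the file is opened and written only on the client.

```matlab
% Sketch: collect rows in a sliced cell array and write the file on the client.
A = [20 50 100 200]; B = [1 2]; C = [-8 -6];   % shortened, illustrative values
rows = cell(1, numel(A));                      % one slot per parfor iteration

parfor k = 1:numel(A)
    a = A(k);
    local_rows = zeros(0, 3);                  % rows collected in this iteration
    for b = B
        for c = C
            if a + b*c > 40                    % hypothetical placeholder test
                local_rows(end+1, :) = [a b c];
            end
        end
    end
    rows{k} = local_rows;                      % sliced output: indexed only by k
end

all_rows = vertcat(rows{:});                   % gather everything on the client
fid = fopen('dummy.txt', 'w');                 % file is opened on the client only
fprintf(fid, '%e %e %e\n', all_rows.');        % transpose: fprintf reads column-wise
fclose(fid);
```

Because no worker ever touches the file, the invalid file identifier error cannot occur, at the cost of holding all rows in memory until the loop finishes.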
Dominik Hiltbrunner on 18 Nov 2021
Comment about the cnt2 variable:
The example I posted is only meant to demonstrate the structure, and I more or less typed random variables into the if-statements. In the real file, I do need the counters (there are about 30 if-statements, each with its own counter, so that I can see after the loop which one failed the most). I apologize for posting a confusing, poorly chosen example.
Comment about the file writing problem:
The order does not in fact matter at all. So having different workers write to individual files is a valid solution for me. I can still merge and sort them afterwards. Also, so far everything is running locally on my machine, i.e. I just want the advantage of running a multi-core program. So again, getting the files back from the workers is no problem.
Can you tell me how to open unique files for each worker?
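One way to do this, sketched here under assumptions rather than taken from the thread, is to name the file after the loop variable k (one of the options mentioned in the answer), open it inside the parfor body so that the identifier is created on the worker, and merge the parts on the client afterwards. The file name pattern dummy_part_%d.txt and the use of tempdir as output folder are hypothetical choices, and the value lists are shortened.

```matlab
% Sketch: each parfor iteration writes its own part file; the client merges them.
A = [20 50 100 200]; B = [1 2]; C = [-8 -6];   % shortened, illustrative values
out_dir = tempdir;                             % assumed visible to client and workers

parfor k = 1:numel(A)
    a = A(k);
    part_name = fullfile(out_dir, sprintf('dummy_part_%d.txt', k)); % hypothetical name
    fid = fopen(part_name, 'w');               % opened on the worker, so valid there
    for b = B
        for c = C
            fprintf(fid, '%e %e %e\n', a, b, c);
        end
    end
    fclose(fid);
end

% Merge on the client. Parts are appended in order of k, which is fine
% when the order of the rows does not matter.
merged_name = fullfile(out_dir, 'dummy.txt');
merged = fopen(merged_name, 'w');
for k = 1:numel(A)
    part_name = fullfile(out_dir, sprintf('dummy_part_%d.txt', k));
    fprintf(merged, '%s', fileread(part_name));
    delete(part_name);                         % remove the part after merging
end
fclose(merged);
```

Naming by k gives one file per iteration rather than per worker, which avoids having to know which worker ran which iteration.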



Dominik Hiltbrunner on 18 Nov 2021
Edited: Dominik Hiltbrunner on 19 Nov 2021
I want to answer my own question based on the answers I got.
The DataQueue can be used to solve all 3 problems.
My approach:
I define 3 data queues:
DQ_waitbar = parallel.pool.DataQueue;
DQ_counter = parallel.pool.DataQueue;
DQ_file_writer = parallel.pool.DataQueue;
Then I assign a callback function to each queue:
afterEach(DQ_waitbar,@parforWaitbar);
afterEach(DQ_counter,@incrementCounter);
afterEach(DQ_file_writer,@writeToFile);
The corresponding functions are defined as follows
function writeToFile(res)
    persistent file_handler
    if isempty(file_handler)
        file_handler = fopen('dummy.txt','w');
    end
    if(res(end)==true)
        % Closing sentinel: close the file without writing the placeholder row
        fclose(file_handler);
        file_handler = [];
        return
    end
    fprintf(file_handler,'%e %e %e %e %e\n',res(1),res(2),res(3),res(4),res(5));
end
function counter = incrementCounter(counter_array)
    persistent cnt
    if isempty(cnt)
        cnt = [0 0];
    end
    cnt = cnt + counter_array;
    counter = cnt;
end
function parforWaitbar(waitbarHandle,iterations)
    persistent count h N
    if nargin == 2
        % Initialization call: store the handle and the total number of updates
        count = 0;
        h = waitbarHandle;
        N = iterations;
    else
        if isvalid(h)
            count = count + 1;
            waitbar(count / N,h,sprintf('Progress: %d %%', floor(count/N*100)));
        end
    end
end
Then I initialize the persistent data
progress_bar = waitbar(0, 'Starting');
parforWaitbar(progress_bar,N_itr); % One update is sent per innermost iteration, so the total is N_itr
incrementCounter([0 0]);
% writeToFile opens its file on the first call, so it needs no initialization
Inside the loop, I can then use
% Somewhere inside the loop
send(DQ_counter,[1 0]);
send(DQ_file_writer,[a,b,c,d,e,0]);
send(DQ_waitbar,[]);
Which works great.
Full code example below:
clc; clear; close all; format shorteng; clear functions;
A = [20 50 100 200];
B = [1 2 3 4 5 6];
C = [-8 -6 -4 -2];
D = [-1 -2 5 8];
E = 1e-6*[1 2 3 4];
X = [3 0 -21 4];
Y = 1e-3*[3 3 1 4];
N_itr = length(A)*length(B)*length(C)*length(D)*length(E);
res = [];
foo = [];
DQ_waitbar = parallel.pool.DataQueue;
DQ_counter = parallel.pool.DataQueue;
DQ_file_writer = parallel.pool.DataQueue;
afterEach(DQ_waitbar,@parforWaitbar);
afterEach(DQ_counter,@incrementCounter);
afterEach(DQ_file_writer,@writeToFile);
% Initialize
progress_bar = waitbar(0, 'Starting');
parforWaitbar(progress_bar,N_itr); % One update is sent per innermost iteration
incrementCounter([0 0]);
parfor k=1:length(A) % Iterate over all indices of A, not only the last one
    a = A(k); % Must be converted this way
    for b=B
        for c=C
            for d=D
                for e=E
                    res = [res, b+c*e*max(abs(eig(rand(a))))];
                    is_valid = true;
                    for x=X
                        for y=Y
                            if(a*x-100*d==0)
                                send(DQ_counter,[1 0]);
                                is_valid = false;
                            end
                            if(b*y==0)
                                send(DQ_counter,[0 1]);
                                is_valid = false;
                            end
                            if(c*x>0)
                                foo = [foo, a-b+c-d+e-x+y*max(abs(eig(rand(10))))];
                            end
                        end
                    end
                    if(is_valid==true)
                        send(DQ_file_writer,[a,b,c,d,e,0]);
                    end
                    send(DQ_waitbar,[]); % Send empty data to queue to increment the waitbar
                end
            end
        end
    end
end
close(progress_bar);
writeToFile([0 0 0 0 0 1]); % Closing sentinel: last element is 1
incrementCounter([0 0])
function writeToFile(res)
    persistent file_handler
    if isempty(file_handler)
        file_handler = fopen('dummy.txt','w');
    end
    if(res(end)==1)
        % Closing sentinel: close the file without writing the placeholder row
        fclose(file_handler);
        file_handler = [];
        return
    end
    fprintf(file_handler,'%e %e %e %e %e\n',res(1),res(2),res(3),res(4),res(5));
end
function counter = incrementCounter(counter_array)
    persistent cnt
    if isempty(cnt)
        cnt = [0 0];
    end
    cnt = cnt + counter_array;
    counter = cnt;
end
function parforWaitbar(waitbarHandle,iterations)
    persistent count h N
    if nargin == 2
        % Initialization call: store the handle and the total number of updates
        count = 0;
        h = waitbarHandle;
        N = iterations;
    else
        if isvalid(h)
            count = count + 1;
            waitbar(count / N,h,sprintf('Progress: %d %%', floor(count/N*100)));
        end
    end
end
Dominik Hiltbrunner on 19 Nov 2021
Update:
I am not entirely happy with my solution. Keeping track of many counters with a data queue results in significant overhead, i.e. the overall execution is slowed down a lot.
Does anyone have suggestions on how to implement counters more efficiently?
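One possible direction, sketched here as an assumption rather than a tested fix for the real program, is to keep the counters as a single parfor reduction variable. As far as I understand the classification rules, a reduction variable has to appear as a whole in the form X = X + expr, so an indexed update such as cnt2(1) = cnt2(1) + 1 cannot be classified, while adding a whole vector can. The sketch counts into a temporary vector inside each iteration and adds it to the reduction variable once, so nothing is sent to the client per event. The two if-tests are hypothetical placeholders and the names cnt and local_cnt are made up for this illustration.

```matlab
% Sketch: count events with a parfor reduction variable instead of a DataQueue.
A = [20 50 100 200]; X = [3 0 -21 4]; Y = 1e-3*[3 3 1 4];  % illustrative values
cnt = [0 0];                                   % one entry per counter

parfor k = 1:numel(A)
    a = A(k);
    local_cnt = [0 0];                         % temporary, private to this iteration
    for x = X
        for y = Y
            if a*x > 100                       % hypothetical placeholder test 1
                local_cnt(1) = local_cnt(1) + 1;   % indexing is fine on a temporary
            end
            if x*y == 0                        % hypothetical placeholder test 2
                local_cnt(2) = local_cnt(2) + 1;
            end
        end
    end
    cnt = cnt + local_cnt;                     % reduction: whole variable, no indexing
end

disp(cnt);
```

With about 30 counters the same pattern would use a 1-by-30 vector. The totals are only available after the loop, which matches the stated goal of seeing afterwards which test failed most often.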
