Each PARFOR Worker Writes to the Same File

I understand that having every worker write to the same file is a no-no. As might be expected, when I run this code the final output file contains some corrupted values; about 10% of the writes fail.
I have no interest in the output being in a deterministic order. The workers are spread across multiple Linux machines. The amount of time to complete a run is long compared to the time to write a single line of output.
Can someone recommend an alternative?
% Run a parametric study
var1 = (-60:0.5:60)';
var2 = (-110:0.5:110)';
var3 = (3.5:0.5:18.5)';
% Remove zero entries, since zero values are not allowed
var1(var1 == 0) = [];
var2(var2 == 0) = [];
var3(var3 == 0) = [];
NS = length(var1)*length(var2)*length(var3); % Number of runs
% Set up the design matrix, desMat
desMat = {var1,var2,var3};
[desMat{:}]=ndgrid(desMat{:});
n=length(desMat);
desMat = reshape(cat(n+1,desMat{:}),[],n);
if exist('./Results.csv', 'file')==2
    delete('./Results.csv');
end
parfor kk = 1:NS
    var1a = desMat(kk,1); var2a = desMat(kk,2); var3a = desMat(kk,3);
    [out1, out2, out3] = Function_Pd(var1a,var2a,var3a);
    vec = [var1a var2a var3a out1 out2 out3];
    fileID = fopen('Results.csv','a');
    fprintf(fileID,'%f %f %f %f %f %f\n',vec);
    fclose(fileID);
end
Paul Safier on 3 Aug 2022
@jessupj The individual files would work except there would be over 3 million files and I believe file servers can have issues with that many...

Accepted Answer

Bruno Luong on 3 Aug 2022
Edited: Bruno Luong on 3 Aug 2022
Maybe (I haven't tested this) you could write to a binary file at a deterministic offset:
fileID = fopen('Results.bin','wb');
parfor ...
    ...
    fseek(fileID, (kk-1)*length(vec)*8, 'bof'); % 8 is the byte size of a double, assuming vec is double
    fwrite(fileID, vec, 'double'); % fwrite defaults to uint8 unless a precision is given
end
fclose(fileID);
Paul Safier on 4 Aug 2022
@Walter Roberson thanks for the advice. I will look into how to use memmapfile. It may also be an option to fill a binary file with garbage before I start the parfor loop and then overwrite it as I go, because of the nature of fseek that you mention. There's still the option that @Jeff Miller brings to light, namely having each worker write to its own file, then I can cat them at the end; however, that may take some legwork to get running. Thanks for the suggestion about the binary file, @Bruno Luong.
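For anyone following along, here is a rough, untested-on-a-cluster sketch of the pre-filled binary file idea. It assumes every worker can see the same Results.bin on a shared file system, that each row is exactly 6 doubles, and it uses a placeholder expression instead of Function_Pd and a small NS so it runs quickly. As far as I know, a file identifier opened on the client is not usable on the workers, so the sketch opens the file inside the loop instead. Network file systems may cache writes in ways that still cause trouble, so treat this as a starting point only.

```matlab
% Sketch only: assumes a shared file system and 6 doubles per result row.
NS      = 20;                 % small run count for illustration
nCols   = 6;                  % var1a var2a var3a out1 out2 out3
binFile = 'Results.bin';

% Pre-fill the file with NaN so every row exists before the loop starts.
fid = fopen(binFile, 'w');
fwrite(fid, nan(nCols, NS), 'double');
fclose(fid);

parfor kk = 1:NS
    % Placeholder values standing in for the output of Function_Pd.
    vec = [kk, 2*kk, 3*kk, kk^2, kk^3, sqrt(kk)];
    f = fopen(binFile, 'r+');              % opened on the worker; 'r+' does not truncate
    fseek(f, (kk-1)*nCols*8, 'bof');       % 8 bytes per double
    fwrite(f, vec, 'double');
    fclose(f);
end

% Read everything back on the client, one row per run.
fid = fopen(binFile, 'r');
results = fread(fid, [nCols, NS], 'double')';
fclose(fid);

% Rows that are still all NaN were never written.
missing = find(all(isnan(results), 2));
writematrix(results, 'Results.csv');
```

Because each run owns a fixed byte range, the rows come back in run order, and the NaN fill gives a cheap way to spot runs that never reported.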

More Answers (2)

Jeff Miller on 3 Aug 2022
Maybe have each worker write to its own output file and then assemble those after? This answer shows how to get the id for each worker.
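A minimal sketch of what that could look like, assuming the Parallel Computing Toolbox is available, that all workers write into a folder the client can also read, and using a placeholder expression instead of Function_Pd. The file name pattern Results_worker%d.csv and the small NS are made up for illustration.

```matlab
% Sketch only: one output file per worker, merged on the client afterwards.
NS = 20;                                      % small run count for illustration

% Clear leftovers from earlier runs (hypothetical file name pattern).
old = dir('Results_worker*.csv');
for k = 1:numel(old)
    delete(fullfile(old(k).folder, old(k).name));
end

parfor kk = 1:NS
    t = getCurrentTask();                     % empty when no pool is running the loop
    if isempty(t)
        id = 0;
    else
        id = t.ID;
    end
    % Placeholder values standing in for the output of Function_Pd.
    vec = [kk, 2*kk, 3*kk, kk^2, kk^3, sqrt(kk)];
    f = fopen(sprintf('Results_worker%d.csv', id), 'a');
    fprintf(f, '%f %f %f %f %f %f\n', vec);
    fclose(f);
end

% Concatenate the per-worker files into a single Results.csv.
parts = dir('Results_worker*.csv');
out = fopen('Results.csv', 'w');
for k = 1:numel(parts)
    fprintf(out, '%s', fileread(fullfile(parts(k).folder, parts(k).name)));
end
fclose(out);
```

Only one process ever appends to a given file, so lines cannot interleave, and the number of files equals the number of workers rather than the number of runs.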
Paul Safier on 4 Aug 2022
@Jeff Miller I think it was to write each iteration to its own file, which would be too much for the file server (>3 million files). Your suggestion of having each worker write to its own file would give a manageable number of files. I will look into the link you sent. Thanks.
Raymond Norris on 12 Aug 2022
@Paul Safier since the order of the output doesn't have to be deterministic, use a data queue to send the results back to the client and have the client write the CSV file.
% Run a parametric study
var1 = (-60:0.5:60)';
var2 = (-110:0.5:110)';
var3 = (3.5:0.5:18.5)';
% Remove zero entries, since zero values are not allowed
var1(var1 == 0) = [];
var2(var2 == 0) = [];
var3(var3 == 0) = [];
NS = length(var1)*length(var2)*length(var3); % Number of runs
% Set up the design matrix, desMat
desMat = {var1,var2,var3};
[desMat{:}]=ndgrid(desMat{:});
n=length(desMat);
desMat = reshape(cat(n+1,desMat{:}),[],n);
if exist('./Results.csv', 'file')==2
    delete('./Results.csv');
end
fileID = fopen('Results.csv','a');
D = parallel.pool.DataQueue;
afterEach(D,@(V)logger(fileID,V))
c = onCleanup(@()fclose(fileID));
parfor kk = 1:NS
    var1a = desMat(kk,1); var2a = desMat(kk,2); var3a = desMat(kk,3);
    [out1, out2, out3] = Function_Pd(var1a,var2a,var3a);
    vec = [var1a var2a var3a out1 out2 out3];
    send(D,vec)
end
function logger(fileID,vec)
    fprintf(fileID,'%f %f %f %f %f %f\n',vec);
end