Handling memory when working with very large data (.mat) files
I am working with two 5D arrays (A5D and B5D) saved in big_mat_file.mat. The size of these arrays is specified in the code below. The total size of big_mat_file.mat is around 20 GB. I want to perform three simple operations on these arrays, as shown in the code. I have access to my university's computing cluster. When I run the following code with 120 workers and 400 GB of memory, I receive the following error:
In distcomp/remoteparfor/handleIntervalErrorResult (line 245)
In distcomp/remoteparfor/getCompleteIntervals (line 395)
In parallel_function>distributed_execution (line 746)
In parallel_function (line 578)
Can someone please help me understand what is causing this error? Is it because of low memory? Is there any other way to do the following operations?
clear; clc;
load("big_mat_file.mat");
% it has two very huge 5D arrays "A5D" and "B5D", and two small arrays "as" and "bs"
% size of both A5D and B5D is [41 16 8 80 82]
% size of "as" is [1 80] and size of "bs" is [1 82]
xs = -12:0.1:12;
NX = length(xs);
ys = 0:0.4:12;
NY = length(ys);
total_iterations = NX * NY;
results = zeros(total_iterations , 41 , 16, 8);
XXs = zeros(total_iterations, 1);
YYs = zeros(total_iterations, 1);
parfor idx = 1:total_iterations
    [ix, iy] = ind2sub([NX, NY], idx);
    x = xs(ix);
    y = ys(iy);
    term1 = 1./(exp(1/y*(A5D-x)) + 10); % operation 1
    to_integrate = B5D.*term1; % operation 2
    XXs(idx) = x;
    YYs(idx) = y;
    results(idx, :, :, :) = trapz(as,trapz(bs,to_integrate,5),4); % operation 3
end
XXs = reshape(XXs, [NX, NY]);
YYs = reshape(YYs, [NX, NY]);
results = reshape(results, [NX, NY, 41, 16, 8]);
clear A5D B5D
save('saved_data.mat','-v7.3');
Accepted Answer
More Answers (1)
Sam Marshalik
30 August 2024
0 votes
You are likely running out of memory on the workers. You are not using sliced input variables (see "Sliced Variables" in the MATLAB & Simulink documentation, mathworks.com) to access the 5D matrices, so a full copy of each matrix is sent to every worker. They are likely big enough that you are running out of memory on those machines. I would suggest running fewer workers (to give each worker access to more memory), using sliced input variables so that only part of each matrix is passed to a worker, or running on machines with more memory.
To test this theory, you can run your work and monitor memory usage on those machines - if this is the issue, you should see it max out.
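As a rough illustration of the sliced-input suggestion above, here is a minimal sketch, not a tested replacement for the original script. It assumes the loop can be reorganised so that parfor runs over the first dimension of A5D and B5D (which is not integrated over), with the x and y sweeps moved into ordinary for-loops inside. The small array sizes, the shortened xs and ys grids, and the names n1, n2, n3, na, nb, Ai, Bi and slab are stand-ins chosen only so that the sketch runs quickly.

```matlab
% Small stand-in arrays (hypothetical sizes) so the sketch runs quickly;
% the real A5D and B5D are [41 16 8 80 82].
n1 = 4; n2 = 3; n3 = 2; na = 5; nb = 6;
A5D = rand(n1, n2, n3, na, nb);
B5D = rand(n1, n2, n3, na, nb);
as  = linspace(0, 1, na);
bs  = linspace(0, 1, nb);

xs = -1:0.5:1;      NX = numel(xs);
ys = 0.4:0.4:1.2;   NY = numel(ys);   % starts above 0 in this sketch to sidestep 1/y at y = 0

results = zeros(n1, NX, NY, n2, n3);

parfor i1 = 1:n1
    % Sliced inputs: the loop index appears in a fixed position, so only
    % the i1-th slab of each array needs to be sent to the worker.
    Ai = reshape(A5D(i1, :, :, :, :), [n2, n3, na, nb]);
    Bi = reshape(B5D(i1, :, :, :, :), [n2, n3, na, nb]);

    slab = zeros(NX, NY, n2, n3);     % temporary, local to this iteration
    for iy = 1:NY
        for ix = 1:NX
            term1 = 1 ./ (exp((Ai - xs(ix)) / ys(iy)) + 10);   % operation 1
            integrand = Bi .* term1;                           % operation 2
            % operation 3: integrate over bs (now dim 4), then as (now dim 3)
            slab(ix, iy, :, :) = trapz(as, trapz(bs, integrand, 4), 3);
        end
    end
    results(i1, :, :, :, :) = slab;   % sliced output
end

% Reorder to the layout used in the question: [NX NY n1 n2 n3]
results = permute(results, [2 3 1 4 5]);
```

With this layout each worker holds one slab of A5D and B5D plus temporaries of the same slab size, rather than full copies of both arrays. The trade-off is that the parfor loop has only as many iterations as the first dimension (41 in the question), so it cannot keep 120 workers busy; flattening the first three dimensions into one before the loop would be one way to get more iterations, at the cost of an extra reshape.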
1 Comment
Luqman Saleem
31 August 2024