parfor variable classification issue revisited

Craig on 11 Aug 2023
Commented: Jeff Miller on 18 Aug 2023
I have a million (literally) text files that I need to read a number from. I currently do this in a nested loop as such:
len_A = 5;
len_B = 6;
len_C = 7;
len_D = 8;
len_E = 9;
output = zeros(prod([len_A, len_B, len_C, len_D, len_E]), 6);
for ind_A = 1 : len_A
    for ind_B = 1 : len_B
        for ind_C = 1 : len_C
            for ind_D = 1 : len_D
                for ind_E = 1 : len_E
                    line_num = sub2ind([len_E, len_D, len_C, len_B, len_A], ind_E, ind_D, ind_C, ind_B, ind_A);
                    % Real Script
                    % open a file from the disk, read in a number
                    % output_temp(count, :) = [line_num, ind_A, ind_B, ind_C, ind_D, ind_E, the number from line above];
                    % Example Script
                    output(line_num, 1:6) = [line_num, ind_A, ind_B, ind_C, ind_D, ind_E];
                end
            end
        end
    end
end
This is time intensive. Since my disk and processor are not maxed out, I wanted to do this in parallel and speed it up. Based on: https://www.mathworks.com/matlabcentral/answers/838625-parfor-variable-classification-issue, I tried:
output = zeros(prod([5, 6, 7, 8, 9]), 6);
% output = zeros(1, 7);
parfor ind_A = 1 : 5
    output_temp = zeros(prod([6, 7, 8, 9]), 6);
    count = 0;
    for ind_B = 1 : 6
        for ind_C = 1 : 7
            for ind_D = 1 : 8
                for ind_E = 1 : 9
                    count = count + 1;
                    line_num = sub2ind([9, 8, 7, 6, 5], ind_E, ind_D, ind_C, ind_B, ind_A);
                    % Real Script
                    % open a file from the disk, read in a number
                    % output_temp(count, :) = [line_num, ind_A, ind_B, ind_C, ind_D, ind_E, the number from line above];
                    % Example Script
                    output_temp(count, 1:6) = [line_num, ind_A, ind_B, ind_C, ind_D, ind_E];
                end
            end
        end
    end
    max_line_num = sub2ind([9, 8, 7, 6, 5], 9, 8, 7, 6, ind_A);
    min_line_num = max_line_num - prod([9, 8, 7, 6, 1]) + 1;
    output(min_line_num : max_line_num, :) = output_temp;
end
I am unable to figure out how to make this work. I would truly appreciate any help you could provide.

Accepted Answer

Walter Roberson on 11 Aug 2023
Create a multidimensional array. parfor along one of the dimensions, preferably the last.
Within the parfor loop, use nested for loops and multidimensional indexing to assign values to a temporary array that is the right size except for being length 1 along the dimension you are parfor over. After you have assigned all the values to the temporary array, copy it into the output with a single statement:
output(:,:,:,:,INDEX, :) = output_temp;
If you need to, then after the parfor loop, reshape() to collapse those other dimensions.
It is important that, in the only place where you write into the output variable, each index is one of: ":", an expression that is constant throughout the parfor, or the parfor variable itself, optionally plus or minus a constant. Using a computed range, as you are doing, is not permitted.
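The pattern described above can be sketched with small sizes as follows. This is only an illustration under assumptions: the three dimensions, the four-column record, and the use of line_num in place of a value read from a file are not from the original script. The parfor runs over the last dimension, and the record columns are placed first so that a plain reshape returns the rows already in line_num order.

```matlab
% Sketch only: small illustrative sizes, no file I/O.
len_A = 2; len_B = 3; len_C = 4;   % assumed sizes, not the real ones
n_cols = 4;                        % one record = [line_num; ind_A; ind_B; ind_C]

% Record columns first, parfor dimension last.
output = zeros(n_cols, len_A, len_B, len_C);

parfor ind_C = 1 : len_C
    % Temporary array: shape of output without the parfor dimension.
    output_temp = zeros(n_cols, len_A, len_B);
    for ind_B = 1 : len_B
        for ind_A = 1 : len_A
            line_num = sub2ind([len_A, len_B, len_C], ind_A, ind_B, ind_C);
            output_temp(:, ind_A, ind_B) = [line_num; ind_A; ind_B; ind_C];
        end
    end
    % The only write into output: every index is ':' or the loop variable.
    output(:, :, :, ind_C) = output_temp;
end

% Collapse to one row per record after the loop.
output = reshape(output, n_cols, []).';
```

Because line_num here follows the same column-major order as the array itself, no sortrows call is needed after the reshape. If Parallel Computing Toolbox is not available, the parfor should simply run as an ordinary loop.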
  2 Comments
Craig on 18 Aug 2023
Edited: Craig on 18 Aug 2023
By following Walter's suggestions, and after some work such as changing the parfor from Walter's recommendation of the last index to the first, this is what I finally got to work for me:
len_A = 5;
len_B = 6;
len_C = 7;
len_D = 8;
len_E = 9;
output = zeros(len_A, len_B, len_C, len_D, len_E, 6);
parfor ind_A = 1 : len_A
    output_temp = zeros(len_B, len_C, len_D, len_E, 6);
    for ind_B = 1 : len_B
        for ind_C = 1 : len_C
            for ind_D = 1 : len_D
                for ind_E = 1 : len_E
                    line_num = sub2ind([len_E, len_D, len_C, len_B, len_A], ind_E, ind_D, ind_C, ind_B, ind_A);
                    % Real Script
                    % open a file from the disk, read in a number
                    % output_temp(count, :) = [line_num, ind_A, ind_B, ind_C, ind_D, ind_E, the number from line above];
                    % Example Script
                    output_temp(ind_B, ind_C, ind_D, ind_E, 1:6) = [line_num, ind_A, ind_B, ind_C, ind_D, ind_E];
                end
            end
        end
    end
    output(ind_A, :, :, :, :, :) = output_temp;
end
output = reshape(output, prod([len_A, len_B, len_C, len_D, len_E]), 6);
output = sortrows(output, 1);
Walter Roberson on 18 Aug 2023
The reason I suggested parfor over the last dimension instead of the first is the way multidimensional arrays are stored: any leading : dimensions are stored in consecutive memory. So if you had A(:,:,idx), then A(1:end,1:end,idx) would be stored in consecutive memory. But if you had A(idx,:,:), then each piece of data would be size(A,1) elements apart from the next in memory, which is not as efficient to transfer as consecutive memory.
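The storage argument can be checked on a small array by listing the linear (memory-order) positions of each kind of slice. This is a minimal sketch; the array size and the variable names are chosen only for illustration.

```matlab
A = zeros(3, 4, 5);   % small illustrative array
idx = 2;

% Linear positions of the elements of A(:, :, idx)
[r, c] = ndgrid(1:3, 1:4);
last_dim_slice = sub2ind(size(A), r(:), c(:), repmat(idx, 12, 1));

% Linear positions of the elements of A(idx, :, :)
[c2, p2] = ndgrid(1:4, 1:5);
first_dim_slice = sub2ind(size(A), repmat(idx, 20, 1), c2(:), p2(:));

disp(diff(last_dim_slice).');    % steps of 1: one contiguous block
disp(diff(first_dim_slice).');   % steps of size(A,1): strided access
```

The first slice occupies positions 13 through 24 with no gaps, while consecutive elements of the second slice are size(A,1) = 3 positions apart.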


More Answers (1)

Jeff Miller on 16 Aug 2023
Edited: Jeff Miller on 18 Aug 2023
Maybe something like this would be helpful, using the wonderful allcomb.
idx = allcomb(1:5,1:6,1:7,1:8,1:9);   % allcomb (File Exchange): one row per index combination
nrows = size(idx,1);
output = zeros(nrows,6);
parfor ind_row = 1:nrows
    idx_A = idx(ind_row,1);
    idx_B = idx(ind_row,2);
    idx_C = idx(ind_row,3);
    idx_D = idx(ind_row,4);
    idx_E = idx(ind_row,5);
    % yourActualFn stands for the real per-file work; it must return a scalar
    result = yourActualFn(idx_A,idx_B,idx_C,idx_D,idx_E);
    output(ind_row,:) = [idx_A, idx_B, idx_C, idx_D, idx_E, result];
end
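If allcomb is not installed, a similar flat loop can be sketched with the built-in ind2sub, which recovers the subscripts from the row number. This is an assumption-laden illustration, not Jeff's method: the sizes are reduced, and the arithmetic on the "result" line is a hypothetical stand-in for the real file read. Note that ind2sub varies the first index fastest, so the row order differs from the one allcomb produces.

```matlab
dims = [2, 3, 4];            % illustrative sizes (assumption)
nrows = prod(dims);
output = zeros(nrows, 4);

parfor ind_row = 1 : nrows
    % Recover the subscripts that belong to this flat row index.
    [ind_A, ind_B, ind_C] = ind2sub(dims, ind_row);
    % Hypothetical placeholder for the real per-file work.
    result = 100 * ind_A + 10 * ind_B + ind_C;
    % Sliced write: the row index is the parfor variable itself.
    output(ind_row, :) = [ind_A, ind_B, ind_C, result];
end
```

Since the row number already equals the linear index, no separate line_num column or sortrows step is needed in this variant.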
  2 Comments
Craig on 18 Aug 2023
Thanks for the reply Jeff. This might allow the calculation of the "line_num", but I don't see how it would allow me to do all the other work in the real script.
Jeff Miller on 18 Aug 2023
@Craig, Glad you got the problem solved.
Just for future reference, I edited the script to make it clearer what I thought you might do. Could be that I don't understand what other work you want to do in the real script, though.

Release

R2023a
