Fast reorganization of cell arrays

Question

Christopher Smith le 15 Fév 2025

0
Lien

Utiliser le lien direct vers cette question

https://fr.mathworks.com/matlabcentral/answers/2174076-fast-reorganization-of-cell-arrays

Commenté : Stephen23 le 6 Avr 2025

I have two cell arrays which store row indices to other arrays for processing. TB is a column cell array that stores the indices for all of the "targets" that interact with "sources" in the SB column cell array. The cell index of TB corresponds to the same cell index of SB. In other words, all of the indices in TB{1} interact with all of the indices in SB{1}. I would like to out of TB all of the unique indices, then find all of the indices in SB that interact with each unique index in TB. The tricky bit is that some of the indices in TB are repeated across cells. I have included a minimum working example that does what I want. It runs quickly enough for small arrays, but becomes slow when the TB and SB arrays/cells get large. Is there a way to make this faster and more compact? Thank you for any insight!

% build the two arrays
TB = cell(3,1); % "target" array
TB{1} = [1;2;3;4;5];
TB{2} = [4;5;6];
TB{3} = [7,8,9,10];
SB = cell(3,1); % "source" array
SB{1} = (1:1:15)';
SB{2} = (16:1:30)';
SB{3} = (31:1:45)';
% build the intermediate sorted array
NTB = length(TB);
out1 = []; % intermediate array 
for cnt = 1:NTB
    for cnt2 = 1:length(TB{cnt})
        out1 = [out1;repmat(TB{cnt}(cnt2),size(SB{cnt},1),1),SB{cnt}];
    end
end
% pull out of the intermediate array the desired arrays
TB_desired = unique(out1(:,1)); % all of the unique "targets"
NTB_desired = length(TB_desired);
SB_desired = cell(NTB_desired,1);
for cnt = 1:NTB_desired
    logvar = TB_desired(cnt) == out1(:,1);
    SB_desired{cnt} = out1(logvar,2); % all of the "sources" that interact with each unique "target"
end

0 commentaires
Afficher -2 commentaires plus anciensMasquer -2 commentaires plus anciens

Connectez-vous pour commenter.

Connectez-vous pour répondre à cette question.

Answer 1

Stephen23 le 15 Fév 2025

0
Lien

Utiliser le lien direct vers cette réponse

https://fr.mathworks.com/matlabcentral/answers/2174076-fast-reorganization-of-cell-arrays#answer_1559834

Modifié(e) : Stephen23 le 15 Fév 2025

Ouvrir dans MATLAB Online

The timings might be similar for small arrays, but try testing on larger arrays!

tic
% Initialize target and source arrays
tgt = {[1;2;3;4;5],[4;5;6],[7,8,9,10]};
src = {(1:15).',(16:30).',(31:45).'};
% Preallocate the intermediate array for better performance
len = sum(cellfun(@numel,tgt) .* cellfun(@numel,src));
arr = zeros(len,2);
% Build the intermediate array more efficiently
idx = 1;
for ii = 1:numel(tgt)
    val = tgt{ii}(:);  % Ensure column vector
    dat = src{ii};
    % Calculate indices for current block:
    num = numel(val) * numel(dat);
    pos = idx:(idx+num-1);
    % Create pairs using vectorized operations:
    [tmp,raw] = ndgrid(val,dat);
    arr(pos,:) = [tmp(:),raw(:)];
    %
    idx = idx + num;
end
% Create final mapping:
[tgx,~,ids] = unique(arr(:,1));
out = accumarray(ids,arr(:,2),[],@(x){x});
toc
Elapsed time is 0.014116 seconds.
SB_desired1 = out
SB_desired1 = 10x1 cell array
    {15x1 double}
    {15x1 double}
    {15x1 double}
    {30x1 double}
    {30x1 double}
    {15x1 double}
    {15x1 double}
    {15x1 double}
    {15x1 double}
    {15x1 double}

Compared with your original code:

tic
% build the two arrays
TB = cell(3,1); % "target" array
TB{1} = [1;2;3;4;5];
TB{2} = [4;5;6];
TB{3} = [7,8,9,10];
SB = cell(3,1); % "source" array
SB{1} = (1:1:15)';
SB{2} = (16:1:30)';
SB{3} = (31:1:45)';
% build the intermediate sorted array
NTB = length(TB);
out1 = []; % intermediate array 
for cnt = 1:NTB
    for cnt2 = 1:length(TB{cnt})
        out1 = [out1;repmat(TB{cnt}(cnt2),size(SB{cnt},1),1),SB{cnt}];
    end
end
% pull out of the intermediate array the desired arrays
TB_desired = unique(out1(:,1)); % all of the unique "targets"
NTB_desired = length(TB_desired);
SB_desired0 = cell(NTB_desired,1);
for cnt = 1:NTB_desired
    logvar = TB_desired(cnt) == out1(:,1);
    SB_desired0{cnt} = out1(logvar,2); % all of the "sources" that interact with each unique "target"
end
toc
Elapsed time is 0.015339 seconds.
SB_desired0
SB_desired0 = 10x1 cell array
    {15x1 double}
    {15x1 double}
    {15x1 double}
    {30x1 double}
    {30x1 double}
    {15x1 double}
    {15x1 double}
    {15x1 double}
    {15x1 double}
    {15x1 double}
isequal(SB_desired0,SB_desired1)
ans = logical
   1

3 commentaires
Afficher 1 commentaire plus ancienMasquer 1 commentaire plus ancien

Christopher Smith le 15 Fév 2025

Modifié(e) : Christopher Smith le 15 Fév 2025

Ouvrir dans MATLAB Online

I refactored the original code and Stephen23's repsonse to do some timing and it would appear that the response does the trick! Thank you!

% build the two arrays
N = 1e3; % number of cells in the "target" and "source" arrays
TB = cell(N,1);
SB = cell(N,1);
for cnt = 1:N
    TB{cnt} = randi([cnt,N],20,1);
    SB{cnt} = randi([cnt,N],50,1);
end
% do some timing
fun1 = @() InteractionSorting1(TB,SB);
fun2 = @() InteractionSorting2(TB,SB);
t_original = timeit(fun1)
t_original = 21.7708
t_new = timeit(fun2)
t_new = 0.1548
% original method
function [TB_desired,SB_desired] = InteractionSorting1(TB,SB)
% build the new sorted array
NTB = length(TB);
out1 = [];
for cnt = 1:NTB
    for cnt2 = 1:length(TB{cnt})
        out1 = [out1;repmat(TB{cnt}(cnt2),size(SB{cnt},1),1),SB{cnt}];
    end
end
% pull out of the sorted array the desired arrays
TB_desired = unique(out1(:,1));
NTB_desired = length(TB_desired);
SB_desired = cell(NTB_desired,1);
for cnt = 1:NTB_desired
    logvar = TB_desired(cnt) == out1(:,1);
    SB_desired{cnt} = out1(logvar,2);
end
end
% new function proposed by Stephen23
function [tgx,out] = InteractionSorting2(tgt,src)
% Preallocate the intermediate array for better performance
len = sum(cellfun(@numel,tgt) .* cellfun(@numel,src));
arr = zeros(len,2);
% Build the intermediate array more efficiently
idx = 1;
for ii = 1:numel(tgt)
    val = tgt{ii}(:);  % Ensure column vector
    dat = src{ii};
    % Calculate indices for current block:
    num = numel(val) * numel(dat);
    pos = idx:(idx+num-1);
    % Create pairs using vectorized operations:
    [tmp,raw] = ndgrid(val,dat);
    arr(pos,:) = [tmp(:),raw(:)];
    %
    idx = idx + num;
end
% Create final mapping:
[tgx,~,ids] = unique(arr(:,1));
out = accumarray(ids,arr(:,2),[],@(x){x});
end

Christopher Smith le 5 Avr 2025

This answer has been super helpful, but the numbers of elements that need to be sorted are much larger than 1e3. This code is much faster than the original, but still is the bottleneck in the larger code that uses it. Is there a faster way to accomplish this task? I've thought about starting a new question, but figure I would post a comment to this answer first. Thanks!

Stephen23 le 6 Avr 2025

https://www.mathworks.com/help/matlab/call-mex-file-functions.html

Connectez-vous pour commenter.

Fast reorganization of cell arrays

0 commentaires
Afficher -2 commentaires plus anciensMasquer -2 commentaires plus anciens

Réponse acceptée

3 commentaires
Afficher 1 commentaire plus ancienMasquer 1 commentaire plus ancien

Plus de réponses (0)

Voir également

Catégories

Tags

Community Treasure Hunt

Fast reorganization of cell arrays

0 commentaires Afficher -2 commentaires plus anciensMasquer -2 commentaires plus anciens

Réponse acceptée

3 commentaires Afficher 1 commentaire plus ancienMasquer 1 commentaire plus ancien

Plus de réponses (0)

Voir également

Catégories

Tags

Community Treasure Hunt

0 commentaires
Afficher -2 commentaires plus anciensMasquer -2 commentaires plus anciens

3 commentaires
Afficher 1 commentaire plus ancienMasquer 1 commentaire plus ancien