Grouping identical matrices in cell array

Karsten Paul on 31 May 2021
Consider two cell arrays (here of size 6x1), where each entry contains a matrix, e.g.
a = { [1 2; 3 4] [2 2 3; 2 2 3] [1 2; 3 4] [1 2; 3 4] [2 3 2; 3 4 5] [2 2 3; 2 2 3] }';
b = { [5 6; 7 8] [2 2 3; 2 2 3] [9 9; 9 9] [5 6; 7 8] [2 3 2; 3 4 5] [2 2 3; 2 2 3] }';
I want to find an array that assigns a group to each of the six entries, i.e.
groups = [1 2 3 1 4 2];
Two entries belong to the same group if both their a{i} and their b{i} matrices are identical (or equal up to a tolerance). I came up with the following brute-force code:
n = length(a);
groups = zeros(n,1);
counter = 0;
for i = 1:n
    if groups(i)~=0
        continue;
    end
    counter = counter + 1;
    groups(i) = counter;
    for j = i+1:n
        if groups(j)~=0
            continue;
        end
        if isequal(a{i},a{j}) && isequal(b{i},b{j})
            groups(j) = counter;
        end
    end
end
which is quite inefficient due to the for-loops. Is there a smarter way of finding these groups? Thanks :)
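The "up to a tolerance" case mentioned above can be sketched by swapping isequal for a tolerance-aware comparison. This is only a minimal sketch: the helper name approxEqual, the absolute tolerance tol = 1e-9, and the three-entry sample data are assumptions for illustration, not part of the original question. Note that tolerance-based equality is not transitive, so the resulting groups can depend on the order of the entries.

```matlab
% Hypothetical sample data: entry 2 differs from entry 1 only by 1e-12.
a = {[1 2; 3 4], [1 2; 3 4] + 1e-12, [2 2 3; 2 2 3]}';
b = {[5 6; 7 8], [5 6; 7 8],         [2 2 3; 2 2 3]}';
tol = 1e-9;   % assumed absolute tolerance

% Two matrices match if they have the same size and every entry differs
% by at most tol. The && short-circuits, so sizes are checked first.
approxEqual = @(x, y) isequal(size(x), size(y)) && all(abs(x(:) - y(:)) <= tol);

n = numel(a);
groups = zeros(n, 1);
counter = 0;
for i = 1:n
    if groups(i) == 0
        counter = counter + 1;
        groups(i) = counter;
        for j = i+1:n
            if groups(j) == 0 && approxEqual(a{i}, a{j}) && approxEqual(b{i}, b{j})
                groups(j) = counter;
            end
        end
    end
end
disp(groups.')
```

With this sample data, entries 1 and 2 end up in the same group and entry 3 in its own.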
1 Comment
Stephen23 on 31 May 2021
"which is quite inefficient due to the for-loops"
I doubt that the loops themselves are consuming much time. Have you run the profiler?
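One way to check where the time goes is to time the grouping code on representative data before optimizing it. The sketch below uses tic/toc on made-up data (n = 2000 pairs built from magic(3), an assumption chosen only so that many entries repeat); for a line-by-line breakdown the same code can be run between profile on and profile off and inspected with profile viewer.

```matlab
% Hypothetical test data: n pairs of 3x3 matrices with many repeats.
n = 2000;
a = cell(n, 1);
b = cell(n, 1);
for k = 1:n
    a{k} = magic(3) + mod(k, 7);
    b{k} = magic(3) + mod(k, 5);
end

tic
groups = zeros(n, 1);
counter = 0;
for i = 1:n
    if groups(i) == 0
        counter = counter + 1;
        groups(i) = counter;
        for j = i+1:n
            if groups(j) == 0 && isequal(a{i}, a{j}) && isequal(b{i}, b{j})
                groups(j) = counter;
            end
        end
    end
end
elapsed = toc;
fprintf('Grouped %d entries into %d groups in %.3f s\n', n, counter, elapsed);

% For a per-line breakdown, run the same code like this instead:
%   profile on
%   ... grouping code ...
%   profile off
%   profile viewer
```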


Accepted Answer

Jan on 31 May 2021
Edited: Jan on 31 May 2021
Start with a simplified version of your code:
n = numel(a);
groups = zeros(n, 1);
counter = 0;
for i = 1:n
    if groups(i) == 0
        counter = counter + 1;
        groups(i) = counter;
        for j = i+1:n
            if groups(j) == 0 && isequal(a{i}, a{j}) && isequal(b{i}, b{j})
                groups(j) = counter;
            end
        end
    end
end
Now let's assume the cell arrays a and b are huge, e.g. 1e6 elements. The nested loops then compare each element with up to 1e6-1 others, which takes a lot of time. It might be cheaper to compute a hash of each {a{k}, b{k}} pair first and group the entries by their hashes:
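% GetMD5 is not built in; it is a File Exchange submission.
Hash = cell(1, n);
for k = 1:n
    % One hash per {a{k}, b{k}} pair; equal pairs give equal hashes.
    Hash{k} = GetMD5({a{k}, b{k}}, 'Array', 'base64');
end
% 'stable' numbers the groups in order of first appearance.
[~, ~, groups] = unique(Hash, 'stable');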
% With 1e6 elements per cell, R2018b, Win10:
% Elapsed time is 21.341034 seconds. % Original
% Elapsed time is 21.286879 seconds. % Cleaned
% Elapsed time is 6.252804 seconds. % Hashing
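If installing GetMD5 is not an option, the same idea can be sketched with built-in functions only by using a text key instead of a hash. This is an assumption-laden variant, not the method timed above: it builds each key with mat2str at 17 significant digits, which only works for 2-D numeric (or logical/char) matrices and will likely be slower than a compiled hash for large inputs.

```matlab
% Sample data from the question.
a = { [1 2; 3 4] [2 2 3; 2 2 3] [1 2; 3 4] [1 2; 3 4] [2 3 2; 3 4 5] [2 2 3; 2 2 3] }';
b = { [5 6; 7 8] [2 2 3; 2 2 3] [9 9; 9 9] [5 6; 7 8] [2 3 2; 3 4 5] [2 2 3; 2 2 3] }';

n = numel(a);
keys = cell(1, n);
for k = 1:n
    % mat2str encodes both values and shape, so [1 2; 3 4] and [1 2 3 4]
    % produce different keys. '|' separates the a-part from the b-part.
    keys{k} = [mat2str(a{k}, 17), '|', mat2str(b{k}, 17)];
end

% Identical keys get the same group number, in order of first appearance.
[~, ~, groups] = unique(keys, 'stable');
disp(groups.')
```

For the sample data this reproduces the grouping requested in the question, returned as a column vector.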
1 Comment
Karsten Paul on 31 May 2021
Great, works perfectly and considerably faster. My cell arrays are indeed quite huge, between 1e4 and 1e6 elements. Thanks :)
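For arrays of this size, the "up to a tolerance" variant from the question could also be approximated with keys rather than pairwise comparisons by quantizing each matrix before building its key. The sketch below is a hedged illustration only: tol = 1e-9 and the sample data are assumptions, the key is built with mat2str rather than GetMD5, and two values that lie within tol of each other but on opposite sides of a rounding boundary will still land in different groups.

```matlab
% Hypothetical sample data: entry 2 differs from entry 1 only by 1e-12.
a = {[1 2; 3 4], [1 2; 3 4] + 1e-12, [2 2 3; 2 2 3]}';
b = {[5 6; 7 8], [5 6; 7 8],         [2 2 3; 2 2 3]}';
tol = 1e-9;   % assumed tolerance, used here as the quantization step

n = numel(a);
keys = cell(1, n);
for k = 1:n
    % Snap every value to an integer multiple of tol, then build a text key.
    qa = round(a{k} / tol);
    qb = round(b{k} / tol);
    keys{k} = [mat2str(qa, 17), '|', mat2str(qb, 17)];
end

[~, ~, groups] = unique(keys, 'stable');
disp(groups.')
```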
