Matlab find unique column-combinations in matrix and respective index

Question

Benvaulter on 22 Mar 2017

https://fr.mathworks.com/matlabcentral/answers/331309-matlab-find-unique-column-combinations-in-matrix-and-respective-index

Edited: Jan on 23 Mar 2017

I have a large matrix with multiple rows and a limited (but larger than 1) number of columns containing values between 0 and 9. I would like to find an efficient way to identify the unique row-wise combinations and their indices, to then build sums (somewhat like pivot logic). Here is an example of what I am trying to achieve:

a =

   1     2     3
   2     2     3
   3     2     1
   1     2     3
   3     2     1

uniqueCombs =

   1     2     3
   2     2     3
   3     2     1

numOccurrences =

   2
   1
   2

indices =

   [1;4]
   [2]
   [3;5]

From matrix a, I want to first identify the unique combinations (row-wise), then count the number of occurrences and identify the row indices of each combination.

I have achieved this by generating strings with num2str and strcat, but this method appears to be very slow. Along these lines, I have tried to find a way to form a new unique number by concatenating the values horizontally (e.g. building 123 from [1;2;3]), but Matlab does not seem to support this. Sums won't work because different combinations can share the same sum, so a sum cannot identify a unique combination. Any suggestions on how to best achieve this? Thanks!
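
On the idea of building one number per row: a minimal sketch, assuming every entry is an integer between 0 and 9 as stated above, is to read each row as a base-10 number through a matrix product. The names `weights`, `key`, and `groupIdx` are illustrative only, and the result stays exact only while the key is below 2^53 (about 15 columns).

```matlab
% Sketch: encode each row of single-digit integers as one base-10 number.
A = [1 2 3
     2 2 3
     3 2 1
     1 2 3
     3 2 1];
nCols   = size(A, 2);
weights = 10 .^ (nCols-1:-1:0).';          % [100; 10; 1] for three columns
key     = A * weights;                     % [123; 223; 321; 123; 321]
[uniqueKeys, ~, groupIdx] = unique(key);   % groupIdx maps each row to its key
```

Rows with the same key belong to the same combination, so `groupIdx` can serve as a group index for counting or summing.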

Answer 1

Guillaume on 22 Mar 2017

https://fr.mathworks.com/matlabcentral/answers/331309-matlab-find-unique-column-combinations-in-matrix-and-respective-index#answer_259890

More or less the same as Jan's, using accumarray instead of splitapply (I'm still old school!):

A = [ 1     2     3
      2     2     3
      3     2     1
      1     2     3
      3     2     1];
[B, ~, ib] = unique(A, 'rows');
numoccurences = accumarray(ib, 1);
indices = accumarray(ib, find(ib), [], @(rows){rows});  % find(ib) simply generates (1:size(A,1))', since every entry of ib is nonzero
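
For the pivot-style sums mentioned in the question, the same group index can be reused. A minimal sketch, assuming a hypothetical column vector `vals` that holds one value per row of A (it is not part of the original post):

```matlab
A = [1 2 3
     2 2 3
     3 2 1
     1 2 3
     3 2 1];
vals = [10; 20; 30; 40; 50];         % hypothetical data, one value per row of A
[B, ~, ib] = unique(A, 'rows');      % ib(k) is the row of B that equals A(k,:)
groupSums  = accumarray(ib, vals);   % sum of vals within each unique combination
% groupSums is [50; 20; 80] for this example
```

Row k of `groupSums` belongs to the combination in row k of `B`.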

Guillaume on 23 Mar 2017

Edited: Guillaume on 23 Mar 2017

I suspect that accumarray will be faster, as it is built-in compiled code whereas splitapply is M-code, but I haven't run any tests.

Note: for the indices,

indices = accumarray(ib, (1:numel(ib))', [], @(rows){rows});

is probably slightly faster, just not as concise.

Jan on 23 Mar 2017

Edited: Jan on 23 Mar 2017

@Guillaume: I compare this with cellfun. In older versions, Matlab shipped the C sources for this Mex function. There, calling a function handle is very expensive, because the Matlab layer has to be called. Therefore the implicitly defined methods selected by strings are much faster: 'length', 'isclass', etc.

Then using a compiled Mex function is not a real benefit, because mexCallMATLAB has some overhead. This might concern accumarray also. I guess that your accumarray approach is faster than the loop, but I know that it looks very cryptic ;-)
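
To illustrate the cellfun remark: cellfun accepts a few legacy function names as character vectors, among them 'length' and 'isclass'. The sketch below, using a made-up cell array `C`, only shows that both call styles give the same result. It makes no timing claim.

```matlab
C = {1:3, 'abcd', [], {1, 2}};             % arbitrary example cell array
lenByName   = cellfun('length', C);        % legacy name passed as a char vector
lenByHandle = cellfun(@length, C);         % function handle, called once per cell
sameResult  = isequal(lenByName, lenByHandle);
isChar      = cellfun('isclass', C, 'char');   % true where the cell holds a char array
```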

But now I can leave the speculations and run a test: With

A = randi([1, 100], 1e5, 3); % Test data

my loop takes 14.75 seconds, your accumarray approach takes 0.44 seconds. The results differ in the order of the indices. So perhaps this is wanted:

[B, iB, iA] = unique(A, 'rows');
indices     = accumarray(iA, (1:numel(iA)).', [], @(r){sort(r)});
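
The ordering difference is expected: the accumarray documentation notes that the order of the values passed to the function is only preserved when the subscripts are sorted. A short sketch, on smaller random data than in the timing above, that checks the sorted variant against find for every group:

```matlab
A = randi([1, 5], 1e3, 3);                  % small test data with many repeated rows
[~, ~, iA] = unique(A, 'rows');
indices = accumarray(iA, (1:numel(iA)).', [], @(r){sort(r)});
% Compare each sorted index list with the plain find() result for that group
ok = true;
for k = 1:numel(indices)
    ok = ok && isequal(indices{k}, find(iA == k));
end
```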

The result is clear: @Benvaulter, please unaccept my answer and select Guillaume's, and of course use it also to save time and energy.

Answer 2

Jan on 22 Mar 2017

https://fr.mathworks.com/matlabcentral/answers/331309-matlab-find-unique-column-combinations-in-matrix-and-respective-index#answer_259879

Edited: Jan on 23 Mar 2017

A = [ 1     2     3; ...
      2     2     3; ...
      3     2     1; ...
      1     2     3; ...
      3     2     1];
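[B, iB, iA] = unique(A, 'rows');
G = unique(iA);                                % group numbers 1..size(B,1)
numOccurrences = splitapply(@numel, iA, iA);   % count rows per group; the grouping vector must match iA in length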

I cannot test a method to obtain the indices list as wanted. I assume this works with splitapply also. A simple loop approach at least:

n = length(G);
indices = cell(1, n);
for k = 1:n
  indices{k} = find(iA == G(k));
end
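
For reference, a sketch of how the index lists might also be collected with splitapply. It relies on the documented pattern of wrapping a nonscalar result in a cell and uses the group numbers iA directly as the grouping variable. The name `rowNumbers` is illustrative, and this is an untimed illustration rather than the code from the answer:

```matlab
A = [1 2 3; 2 2 3; 3 2 1; 1 2 3; 3 2 1];
[B, ~, iA] = unique(A, 'rows');
rowNumbers = (1:numel(iA)).';                          % row numbers of A as a column
indices = splitapply(@(r){r}, rowNumbers, iA);         % cell array, one index list per row of B
numOccurrences = splitapply(@numel, rowNumbers, iA);   % [2; 1; 2]
```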

[EDITED] Code is tested now. Use Guillaume's much faster solution for production work.

Benvaulter on 23 Mar 2017

Perfect solution to my problem - thanks a lot!
