Count occurrences of a string in a single cell array (how many times a string appears)

I have a single cell array containing strings, as shown below:
xx = {'computer', 'car', 'computer', 'bus', 'tree', 'car'};
I am trying to get the output as two cell arrays, as shown:
xx = {'computer', 'car', 'bus', 'tree'}
occ = {'2', '2','1','1'}
Your suggestions and ideas are highly appreciated. Thanks in advance.

Accepted Answer

Azzi Abdelmalek on 12 Feb 2014
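xx = {'computer', 'car', 'computer', 'bus', 'tree', 'car'};
a = unique(xx, 'stable')   % unique strings, in order of first appearance
b = cellfun(@(x) sum(ismember(xx, x)), a, 'UniformOutput', false)   % count of each unique string, as a cell array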
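The question asks for the counts as strings ('2', '2', ...), while the code above returns numeric counts inside a cell array. Below is a minimal sketch of one way to get the requested format, assuming that exact format is the goal. The variable names counts and occ are chosen here for illustration only.

```matlab
xx = {'computer', 'car', 'computer', 'bus', 'tree', 'car'};
a = unique(xx, 'stable');                       % unique strings, in order of first appearance
counts = cellfun(@(x) sum(strcmp(xx, x)), a);   % numeric vector of counts, one per unique string
occ = arrayfun(@num2str, counts, 'UniformOutput', false);   % counts converted to a cell array of strings
```

If numeric counts are enough, the last line can be dropped and counts used directly.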

More Answers (4)

Jos (10584) on 12 Feb 2014
A faster and more direct method of counting, using the third output of UNIQUE:
XX = {'computer', 'car', 'computer', 'bus', 'tree', 'car'}
[uniqueXX, ~, J]=unique(XX)
occ = histc(J, 1:numel(uniqueXX))
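The MathWorks documentation marks histc as not recommended and points to histcounts as its replacement. Here is a sketch of the same count written with histcounts, assuming R2014b or later. histcounts treats its second argument as bin edges and the last bin includes its right edge, so one extra trailing edge is needed. Without the 'stable' option, unique returns the strings in sorted order, so the counts follow that order.

```matlab
XX = {'computer', 'car', 'computer', 'bus', 'tree', 'car'};
[uniqueXX, ~, J] = unique(XX);       % sorted unique strings; J maps each element of XX into uniqueXX
edges = 1:(numel(uniqueXX) + 1);     % one bin per index: [1,2), [2,3), ..., [n, n+1]
occ = histcounts(J, edges);          % occ(k) is the number of times uniqueXX{k} occurs in XX
```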
  6 Comments
Adam Danz on 29 Aug 2020
Edited: Adam Danz on 29 Aug 2020
For the carsmall data used in the other comparisons, histc was actually 1.33x faster than histcounts in R2019b and 1.22x faster in R2020a (MATLAB Online). On both machines I repeated the 10000-rep analysis 3 times, and the final results were all within +/-0.02 of what's reported.
The difference between those numbers and your results may have to do with first-time costs if you're just measuring the execution once with tic/toc.
I like your dedication to optimization! 😎
Bruno Luong on 29 Aug 2020
Edited: Bruno Luong on 29 Aug 2020
No first-time cost, I assure you. I posted just one result and snippet for simplicity, but I ran tic/toc in a loop and within a function, on 2 different computers (Windows 8.1 and Windows 10, both with R2020a).
The conclusion on my side does not change.
Yeah, I'm kind of obsessed with MATLAB speed, and I can't hide it.
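For anyone who wants to repeat this kind of comparison, timeit calls the function multiple times and reports a median, which reduces the influence of first-call overhead compared with a single tic/toc. This is a rough sketch, assuming a release where histc, histcounts, and timeit are all available. The test data and variable names are made up for illustration and are not the carsmall data discussed above. No timings are quoted because they depend on the machine and release.

```matlab
% Hypothetical test data: 1e5 strings drawn at random from four labels
labels = {'computer', 'car', 'bus', 'tree'};
data = labels(randi(numel(labels), 1, 1e5));

[u, ~, J] = unique(data);   % J maps each element of data into u
n = numel(u);

% Time each counting approach on the same index vector
tHistc      = timeit(@() histc(J, 1:n));
tHistcounts = timeit(@() histcounts(J, 1:(n + 1)));
tAccumarray = timeit(@() accumarray(J(:), 1, [n, 1]));

fprintf('histc: %.3g s, histcounts: %.3g s, accumarray: %.3g s\n', ...
    tHistc, tHistcounts, tAccumarray);
```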



MSchoenhart on 27 Sep 2018
Edited: Adam Danz on 29 Aug 2020
A very fast and simple vectorized method is to use categorical arrays (available since R2013b). "countcats" also uses histc in the background, but the code looks much cleaner:
xx = {'computer', 'car', 'computer', 'bus', 'tree', 'car'};
c = categorical(xx);
categories(c)
countcats(c)
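Note that categorical sorts the category names by default, so the counts above come back in alphabetical order. If the order of first appearance matters, as in the output requested in the question, one option is to pass an explicit value set to categorical. A small sketch, where names and counts are illustrative variable names:

```matlab
xx = {'computer', 'car', 'computer', 'bus', 'tree', 'car'};
c = categorical(xx, unique(xx, 'stable'));   % category order follows first appearance in xx
names = categories(c);                       % cell array of category names
counts = countcats(c);                       % counts, in the same order as names
```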
  2 Comments
Adam Danz on 29 Aug 2020
Edited: Adam Danz on 29 Aug 2020
*Edited question to format code
Nice solution!
Giuseppe Degan Di Dieco on 27 Apr 2021
Dear MSchoenhart,
thanks for your solution, it helped me too.
Best!



Bruno Luong on 29 Aug 2020
Edited: Bruno Luong on 29 Aug 2020
[yy,~,i] = unique(xx,'stable');
count = accumarray(i(:),1,[numel(yy),1]);
  1 Comment
Adam Danz on 29 Aug 2020
+1
Just as fast if not a tad faster than Jos' already-super-fast solution.



Girish Chandra on 12 Feb 2017
Edited: Adam Danz on 29 Aug 2020
Without using the histc function, you can do it in the following way:
xx = {'computer', 'car', 'computer', 'bus', 'tree', 'car'};
U = unique(xx);
A = zeros(1, numel(U));   % A(i) will hold the count for U{i}
for i = 1:numel(U)
    for j = 1:numel(xx)
        if strcmp(U{i}, xx{j})
            A(i) = A(i) + 1;
        end
    end
end
  3 Comments
Jon Adsersen on 8 Apr 2020
Based on the answer by Jos, a function that works for both numerical and string arrays could be formulated:
function [rep_values, N_rep, ind_rep] = f_reapeated_elements(A)
% Find repeated elements in A (can be numeric, a cell array of strings, etc.)
% Outputs:
%   rep_values - repeated values in A (occurring 2 or more times)
%   N_rep      - number of repetitions of the values given in "rep_values"
%   ind_rep    - indices in A of repeated values (occurring 2 or more times)
[un, ~, ind_un] = unique(A);
N_A = histc(ind_un, 1:numel(un));
rep_values = un(N_A > 1);
N_rep = N_A(N_A > 1);
ind_cell = cell(1, numel(rep_values));
A_list = 1:numel(A);
for k = 1:numel(rep_values)
    if isnumeric(rep_values)
        ind_cell{k} = find(A == rep_values(k));
    else
        log_ind = strcmp(A, rep_values(k));
        ind_cell{k} = A_list(log_ind);
    end
end
ind_rep = unique([ind_cell{:}]);
Adam Danz on 29 Aug 2020
*Answer edited to correctly format code
