Create a join on two cell-Arrays

Vincent on 22 Sep 2011
Hi,
I've got two cell-arrays:
A = {'id1','H2O';
'idc','O2';
'id3','CO2'};
B = {'idc';'id1';'id3';'id1';'id1'};
After running my script, I want B to be modified so that it contains:
B = {'idc','O2';
'id1','H2O';
'id3','CO2';
'id1','H2O';
'id1','H2O'};
Does anyone have an idea that avoids loops? Thanks :)

Accepted Answer

Walter Roberson on 22 Sep 2011
[tf, aidx] = ismember(B, A(:,1));
B = [B, A(aidx,2)];
  1 comment
Jan on 22 Sep 2011
Or B=A(aidx, :);
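
For readers following along, here is a minimal sketch of what the two lines above compute on the data from the question (using 'H2O' as in the desired output). The second part assumes that ids missing from A should get an empty placeholder; the id 'idX' and the variables joined1, joined2, joined3, B2, tf2 and aidx2 are hypothetical names used only for illustration.

```matlab
A = {'id1','H2O';
     'idc','O2';
     'id3','CO2'};
B = {'idc';'id1';'id3';'id1';'id1'};

% tf(k) is true when B{k} occurs in A(:,1); aidx(k) is the matching
% row of A, or 0 when there is no match.
[tf, aidx] = ismember(B, A(:,1));

joined1 = [B, A(aidx,2)];   % append the looked-up second column to B
joined2 = A(aidx, :);       % or take whole rows of A directly

% Assumption: ids that are missing from A should map to ''.
% 'idX' is a hypothetical id that does not occur in A.
B2 = {'idc';'idX';'id1'};
[tf2, aidx2] = ismember(B2, A(:,1));
joined3 = [B2, repmat({''}, numel(B2), 1)];  % start with empty values
joined3(tf2, 2) = A(aidx2(tf2), 2);          % fill only the matched rows
```

Indexing with aidx directly fails when any entry is 0, so the logical mask tf2 is what keeps the last assignment safe.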


More Answers (3)

Vincent on 22 Sep 2011
Works like a charm, thanks! This very useful option of ismember is way too hidden :-/
I've been working with MATLAB for two months now and had been using a dirty workaround until now...

Jan on 22 Sep 2011
ISMEMBER, INTERSECT and SETDIFF are very powerful and optimized to work on huge data sets. But if you operate on cell strings with < 1000 elements, some simple loops are usually much faster:
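A = {'id1','H2O';
     'idc','O2';
     'id3','CO2'};
B = {'idc';'id1';'id3';'id1';'id1'};
B = myFunc(A, B);

function B = myFunc(A, B)
% For each entry of B, find the row of A whose first column matches it.
index = zeros(numel(B), 1);
for i = 1:size(A, 1)
    index(strcmp(B, A{i,1})) = i;  % entries of B equal to the i-th key
end
B = A(index, :);  % errors if an entry of B has no match (index stays 0)
end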
This is 27 times faster than the ISMEMBER approach for your tiny dataset. Therefore I would not call such FOR loops a "dirty workaround".
[EDITED] Using the index method instead of the former method to create the column cell directly is 40% faster.

Vincent on 22 Sep 2011
Nice, this leveled up my self-esteem :)
But the code takes much more time to write and doesn't stay readable. And of course, I gave just a tiny example above; my datasets are usually between 600x40 and 2000x40 entries in size.
But thank you anyway for this hint.
  1 comment
Jan on 22 Sep 2011
This function vanishes inside a subfunction such that the actual program contains "B=myFunc(A,B)" only, which is easy to read.
I'd be very interested in a speed comparison of the two methods for the larger dataset. I'll modify my example to work for cells with 40 columns in a few minutes.
I've converted an equivalent function into a C-Mex: http://www.mathworks.com/matlabcentral/fileexchange/24380-cstrainbp
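
A speed comparison on the larger sizes mentioned in this thread could be sketched as follows. This is only a rough harness under stated assumptions: the test data is synthetic (2000 unique ids, 40 columns, 2000 lookups), the names joinIsmember and joinLoop are hypothetical wrappers around the two approaches from the answers, timeit requires R2013b or later, and local functions in a script require R2016b or later. No timings are claimed here; results will depend on the machine and the MATLAB release.

```matlab
% Hypothetical test data: nKeys unique ids, nCols columns, nLookups lookups.
nKeys    = 2000;
nCols    = 40;
nLookups = 2000;

A = cell(nKeys, nCols);
for k = 1:nKeys
    A{k,1} = sprintf('id%d', k);            % unique key in column 1
    for c = 2:nCols
        A{k,c} = sprintf('v%d_%d', k, c);   % arbitrary payload
    end
end
B = A(randi(nKeys, nLookups, 1), 1);        % keys drawn from A, with repeats

% Both approaches must produce the same joined cell array.
assert(isequal(joinIsmember(A, B), joinLoop(A, B)));

% timeit calls each handle several times and returns a typical run time.
tIsmember = timeit(@() joinIsmember(A, B));
tLoop     = timeit(@() joinLoop(A, B));
fprintf('ismember: %.4g s, loop: %.4g s\n', tIsmember, tLoop);

function C = joinIsmember(A, B)
% Join via the second output of ismember.
[~, aidx] = ismember(B, A(:,1));
C = A(aidx, :);
end

function C = joinLoop(A, B)
% Join via one strcmp per key of A.
index = zeros(numel(B), 1);
for i = 1:size(A, 1)
    index(strcmp(B, A{i,1})) = i;
end
C = A(index, :);
end
```

The loop version performs one strcmp over all of B for every row of A, so its cost grows with the product of the two sizes, which is why measuring on realistic sizes matters before choosing between the two.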

