Large Cell Array Data Query (USDA FIA data)
Hi, I have two large cell arrays (USDA FIA data). I am trying to join them (Data B onto Data A) using TRE_CN (tree numbers stored as strings, e.g. '212152293031', '212152393031', ...).
I tried two options.
1. for loop and strcmp
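% Result: the 5 columns of dataA plus the matching BIOMASS value from dataB
Fnl_mat=cell(rows_dataA,6);
Fnl_mat(:,1:5)=dataA;
for i=1:rows_dataA
    % Compare every dataB key against the current dataA key (full scan per row)
    Qry_mat=strcmp([dataB{:,1}]',dataA{i,1}{1,1});
    Fnl_mat(i,6)=dataB(Qry_mat,2);
end
save(filename,'Fnl_mat');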
2. getnameidx
% Look up every dataA key in the list of dataB keys in a single call
idx=getnameidx([dataB{:,1}],[dataA{:,1}]);
Fnl_mat=cell(rows_dataA,6);
Fnl_mat(:,1:5)=dataA;
Fnl_mat(:,6)=dataB(idx,2);
save(filename,'Fnl_mat');
However, both options take far too long (about 10,000 seconds) because of the large number of rows (>30,000 in dataA and >600,000 in dataB). How can I solve this problem?
Dataset A
% TRE_CN PLT_CN INVYR SUBP HT
% String String Number Number Number
'291024' '12312' 2009 1 60
'291124' '12312' 2009 1 38
...
...
over 30000 rows
Dataset B
% TRE_CN BIOMASS
% String Number
'220324' 800
'220424' 345
...
...
'291024' 580
'291124' 304
...
...
over 600000 rows
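For readers with the same problem, one commonly used alternative to the per-row strcmp scan is a single vectorized lookup with ismember, which also accepts cell arrays of strings. The following is only a minimal sketch, not something taken from the thread. It assumes the keys in column 1 are plain char arrays and uses toy data built from the sample rows above. The names keysA, keysB, found and loc are introduced here for illustration.

```matlab
% Toy data mirroring the layout in the question (values from the sample rows).
dataA = {'291024', '12312', 2009, 1, 60;
         '291124', '12312', 2009, 1, 38};
dataB = {'220324', 800;
         '220424', 345;
         '291024', 580;
         '291124', 304};

% Assumption: column 1 of each array holds plain char keys. If each key is
% wrapped in its own 1x1 cell, as the indexing in the question suggests,
% flatten first: keysA = [dataA{:,1}]'; keysB = [dataB{:,1}]';
keysA = dataA(:, 1);
keysB = dataB(:, 1);

% One vectorized lookup: found(i) is true when keysA{i} occurs in keysB,
% and loc(i) is then a row of dataB holding that key (0 when not found).
[found, loc] = ismember(keysA, keysB);

% Copy dataA and append the matching BIOMASS value as column 6.
Fnl_mat = cell(size(dataA, 1), 6);
Fnl_mat(:, 1:5) = dataA;
Fnl_mat(found, 6) = dataB(loc(found), 2);   % unmatched rows keep [] in column 6
```

With the toy data, column 6 of Fnl_mat ends up holding 580 and 304. Unlike the loop in option 1, this does not raise an error when a TRE_CN has no match in dataB, and it avoids making one full pass over the 600,000 dataB keys for each of the 30,000 dataA rows. If TRE_CN can repeat within dataB, only one of the matching rows is used, so duplicates would need separate handling.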
Answers (1)
SUNGHO
25 Sep 2012