How do I add a new variable to a tall table when it depends on information in another, smaller tall table?
I have a tall table called tt_train, and one of its variables is named store_nbr. I also have another tall table called tt_stores containing the unique store_nbr values. I want to add a new variable, tt_train.store_type, whose content is the store type available in tt_stores. Since tt_train has more rows than tt_stores, store_type must be matched according to the store_nbr in tt_train.
In a normal table, I would do this:
tbl_train.store_type = NaN(height(tbl_train), 1); % preallocate, assuming store_type is numeric
for i = 1:height(tbl_train)
    tbl_train.store_type(i) = tbl_stores.store_type(tbl_stores.store_nbr == tbl_train.store_nbr(i));
end
Since this kind of indexing is not possible for tall tables, I do not know how to proceed in this case, or how to save the new tall table to disk.
I am new to big data, but I am an experienced MATLAB user.
Thanks for your help.
Answers (1)
Edric Ellis
12 Dec 2017
You can use the tall table join or innerjoin method to do this. Here's a simple example. I'm using innerjoin here because my simple info table doesn't have a row for every Origin value in the data.
% Create a tall table
varnames = {'ArrDelay', 'DepDelay', 'Origin'};
ds = datastore('airlinesmall.csv', 'TreatAsMissing', 'NA', ...
'SelectedVariableNames', varnames);
tt = tall(ds);
% Create a non-tall table of information
info = table({'LAX'; 'SJC'; 'BUR'}, [1;2;3], ...
'VariableNames', {'Airport', 'SomeProperty'});
% Use 'innerjoin' to add information
jt = innerjoin(tt, info, 'LeftKeys', 'Origin', 'RightKeys', 'Airport');
% Display results
gather(head(jt))
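Applied to the variable names in the question, the same approach might look like the sketch below. It is only an illustration under some assumptions: the small tables built at the top are made-up stand-ins for the real tt_train and tt_stores, the sales variable and the output folder are hypothetical, and the lookup table is assumed to be small enough to gather into memory. The write function is used to save the joined tall table to disk.

```matlab
% Hypothetical stand-in data; in practice tt_train and tt_stores already exist
tbl_train  = table([1; 2; 1; 3], [10; 20; 30; 40], ...
    'VariableNames', {'store_nbr', 'sales'});
tbl_stores = table([1; 2; 3], {'A'; 'B'; 'A'}, ...
    'VariableNames', {'store_nbr', 'store_type'});
tt_train  = tall(tbl_train);
tt_stores = tall(tbl_stores);

% The lookup table is assumed to be small, so bring it into memory
% and keep only the key and the variable to be added
stores = gather(tt_stores);
stores = stores(:, {'store_nbr', 'store_type'});

% Match each row of tt_train to its store_type via store_nbr
tt_joined = innerjoin(tt_train, stores, ...
    'LeftKeys', 'store_nbr', 'RightKeys', 'store_nbr');

% Save the joined tall table to a folder on disk (hypothetical location)
outDir = tempname;
write(outDir, tt_joined);

% Inspect the first few rows
gather(head(tt_joined))
```

Note that innerjoin drops any row of tt_train whose store_nbr has no match in the lookup table, and the row order of the result is not guaranteed to match the original. The folder produced by write should be readable again later by pointing datastore at it and wrapping the result in tall.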