For loop only working/filling cell array for half of data

Claudia on 11 Nov 2022
Edited: Karim on 12 Nov 2022
I am trying to use a for loop to fill a cell array containing tables with various statistics (e.g. mean, median ...) for sites within a large dataset.
The aim is to end up with a cell array 1x42, with a table for each variable.
The loop seems to work only for the first 16 variables; the remaining tables are empty. However, if I run the same loop body for a single variable (e.g. i = 20), the code works and the output is a filled table.
Code and input data are attached.
clear variables; clc; load x.mat;
for i = 1:(size(x,2))
x = x(~isnan(table2array(x(:,i))),:);
[site_num,ia,obs_count] = unique(x.site_num,'sorted');
ans_mean = accumarray(obs_count,table2array(x(:,i)),[],@(x)mean(x,'omitnan')); ans_mean = [array2table(ans_mean)];
txt1 = x(:,i).Properties.VariableNames; txt2 = ans_mean.Properties.VariableNames; header = strcat(txt1,{'_'},txt2); ans_mean = renamevars(ans_mean,'ans_mean',header);
ans_median = accumarray(obs_count,table2array(x(:,i)),[],@(x)median(x,'omitnan')); ans_median = [array2table(ans_median)];
txt1 = x(:,i).Properties.VariableNames; txt2 = ans_median.Properties.VariableNames; header = strcat(txt1,{'_'},txt2); ans_median = renamevars(ans_median,'ans_median',header);
ans_std = accumarray(obs_count,table2array(x(:,i)),[],@(x)std(x,'omitnan')); ans_std = [array2table(ans_std)];
txt1 = x(:,i).Properties.VariableNames; txt2 = ans_std.Properties.VariableNames; header = strcat(txt1,{'_'},txt2); ans_std = renamevars(ans_std,'ans_std',header);
ans_lq = accumarray(obs_count,table2array(x(:,i)),[],@(x)quantile(x,0.25)); ans_lq = [array2table(ans_lq)];
txt1 = x(:,i).Properties.VariableNames; txt2 = ans_lq.Properties.VariableNames; header = strcat(txt1,{'_'},txt2); ans_lq = renamevars(ans_lq,'ans_lq',header);
ans_uq = accumarray(obs_count,table2array(x(:,i)),[],@(x)quantile(x,0.75)); ans_uq = [array2table(ans_uq)];
txt1 = x(:,i).Properties.VariableNames; txt2 = ans_uq.Properties.VariableNames; header = strcat(txt1,{'_'},txt2); ans_uq = renamevars(ans_uq,'ans_uq',header);
obs_count = array2table(accumarray(obs_count,1)); txt1 = x(:,i).Properties.VariableNames; header = strcat(txt1,{'_'},{'obs_count'}); obs_count = renamevars(obs_count,'Var1',header);
all{i} = [array2table(site_num) ans_mean ans_median ans_std ans_lq ans_uq obs_count];
end
Any thoughts/help/tips would be greatly appreciated! Thank you!
Apologies if my code is quite inefficient, I'm still in the learning process :)
2 comments
Stephen23 on 11 Nov 2022
Simplify your code by replacing these:
table2array( x(:,1) )
with
x{:,1}
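A quick way to check that the two forms are interchangeable, sketched on a small made-up table (the table T and its column names are assumptions for illustration, not the poster's data):

```matlab
% Hypothetical toy table standing in for the real data set
T = table([1;1;2], [10;NaN;30], 'VariableNames', {'site_num','temp'});

viaFunction = table2array(T(:,2));   % convert a one-column sub-table to an array
viaBraces   = T{:,2};                % brace indexing extracts the same contents

% isequaln treats NaN values as equal, so it can compare columns with gaps
sameResult = isequaln(viaFunction, viaBraces)
```

Both expressions return a plain numeric column, so the brace form can replace table2array wherever a single column is extracted.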
Claudia on 11 Nov 2022
Thanks so much for the tip Stephen! I will make sure to do that in the future :)


Accepted Answer

Karim on 11 Nov 2022
Edited: Karim on 12 Nov 2022
One issue was the reuse of the variable name "x" directly after entering the loop: you overwrite your original data by removing the rows that contain a NaN in the current column. Those rows are then missing for every later column, so after a few iterations you are left with no data.
It's better to create a temporary variable (I called it "currData") to hold the data you are working on in the current iteration. I shortened the code a bit and added a few comments.
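The shrinking effect can be reproduced on a tiny made-up table (the values and the names a, b, c and rowsLeft are assumptions chosen for illustration):

```matlab
% Hypothetical toy table: column b has a NaN in row 1, column c in row 2
x = table([1;2;3], [NaN;5;6], [7;NaN;9], 'VariableNames', {'a','b','c'});

rowsLeft = zeros(1, width(x));
for i = 1:width(x)
    % Overwriting x drops the rows for good, so later columns see fewer rows
    x = x(~isnan(x{:,i}), :);
    rowsLeft(i) = height(x);
end
rowsLeft   % number of rows remaining after each iteration
```

Column c has two valid values when looked at on its own, but only one row is still present by the third iteration. The same mechanism, applied over many columns, would empty the table entirely.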
% load mat file
load(websave('myFile', "https://www.mathworks.com/matlabcentral/answers/uploaded_files/1189013/x.mat"));
% allocate a cell array for the output data
AllData = cell(1,size(x,2));
for i = 1:size(x,2)
% extract data for current loop, and convert to array
% EDIT: included Stephen23's proposal to extract the data
currData = x{:,i};
% figure out which values are a number
NumIdx = ~isnan( currData );
% only keep the numbers for further processing
currData = currData(NumIdx);
% find the sorted unique site numbers among the kept rows, plus a group index per row
[site_num,~,obs_count] = unique(x.site_num(NumIdx) ,'sorted');
% get the name of the current variable
currVarName = x(:,i).Properties.VariableNames + "_";
% do the processing
ans_mean = accumarray(obs_count,currData,[],@(x)mean(x,'omitnan'));
ans_median = accumarray(obs_count,currData,[],@(x)median(x,'omitnan'));
ans_std = accumarray(obs_count,currData,[],@(x)std(x,'omitnan'));
ans_lq = accumarray(obs_count,currData,[],@(x)quantile(x,0.25));
ans_uq = accumarray(obs_count,currData,[],@(x)quantile(x,0.75));
% create the table names for the current variable
varNames = [ currVarName + "site_num";
currVarName + "ans_mean";
currVarName + "ans_median";
currVarName + "ans_std";
currVarName + "ans_lq";
currVarName + "ans_uq";
currVarName + "obs_count" ];
% gather the data in a table
currTable = table(site_num, ans_mean, ans_median, ans_std, ans_lq, ans_uq, accumarray(obs_count,1),...
'VariableNames',varNames);
% store the table in the output cell array
AllData{i} = currTable;
end
% have a look at the data in the output cell
AllData
AllData = 1×42 cell array
{130×7 table} {130×7 table} {130×7 table} {130×7 table} {92×7 table} {76×7 table} {57×7 table} {104×7 table} {99×7 table} {53×7 table} {130×7 table} {104×7 table} {67×7 table} {102×7 table} {98×7 table} {45×7 table} {26×7 table} {26×7 table} {18×7 table} {68×7 table} {81×7 table} {69×7 table} {27×7 table} {62×7 table} {9×7 table} {0×7 table} {66×7 table} {66×7 table} {65×7 table} {65×7 table} {15×7 table} {8×7 table} {46×7 table} {27×7 table} {51×7 table} {29×7 table} {51×7 table} {29×7 table} {29×7 table} {17×7 table} {16×7 table} {9×7 table}
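To see what the unique/accumarray pair in the loop is doing, here is a minimal sketch on made-up numbers (site, vals and the other names are hypothetical, not taken from x.mat):

```matlab
% Hypothetical toy data: three observations at site 10, one at site 20
site = [10; 20; 10; 10];
vals = [1; 4; 3; 5];

% groupIdx(k) is the position of site(k) in the sorted list of unique sites
[siteList, ~, groupIdx] = unique(site, 'sorted');

groupMean  = accumarray(groupIdx, vals, [], @mean);   % mean of vals per site
groupCount = accumarray(groupIdx, 1);                 % number of observations per site

result = table(siteList, groupMean, groupCount)
```

The third output of unique serves as the subscript vector for accumarray, which is why each table in AllData has one row per site that has at least one non-NaN value for that variable.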
1 comment
Claudia on 11 Nov 2022
Thank you soooooo much Karim! You have literally saved the day :)
Thank you for your detailed, thoughtful and super helpful answer! I really appreciate it!

Release: R2022a
