Using groupsummary with Floating Number of Columns to Analyze
4 vues (au cours des 30 derniers jours)
Afficher commentaires plus anciens
Hello,
I'm planning on using the groupsummary function to analyze different groups of data at various timesteps (columns).
The number of timesteps will vary, so I don't want to hard code in a set number.
My first issue that I'm having came whenever I changed the name of the columns that contain the data I want to analyze.
After importing into MATLAB, the column headers were originally just numbers (0, 1, 2, and 3), so MATLAB gave it generic 'VarName3', 'VarName4', 'VarName5', and 'VarName6'. (Columns 1 and 2 aren't needed)
I changed these to be the original numbers by extracting data from the Excel file and converting the numbers to strings (using num2str).
When the column headers were the VarName, the groupsummary function worked (code below):
G = groupsummary(table,'groupnames',"sum",{'VarName3','VarName4'});
But now that I changed it to be reflective of the timestep in quesiton, it gives the error 'Invalid data variable'.
G = groupsummary(table,'groupnames',"sum",{'0','1'});
I'm not sure why just changing the column headers would cause an error.
If anyone has any insight, I would appreciate it.
My second question is how to have the number of columns to analyze be dependent on the imported table?
I was thinking something like this, where the first column to analyze is always the 3rd column, and the very last column is never analyzed, and N is the total number of columns.
for k=3:N-1
G = groupsummary(table,'groupnames',"sum",table(:,k));
end
As I understand it, the part in the function where you state which columns you want to analyze must be written as a string value, so something in the apostrophe.
With that in mind, I'm not sure how to make it float, since, at least as far as I'm aware, you can't put variable (k) in apostrophes.
Any help is appreciate, let me know if you'd like more detail, or I can split this question up into two separate posts.
Thanks.
2 commentaires
Cris LaPierre
le 25 Avr 2023
Please attach your file, or save your variables to a mat file and attach that to your post using the paperclip icon.
Réponse acceptée
Cris LaPierre
le 25 Avr 2023
There is nothing wrong with your first code snippet. The error is simply telling you that your table does not have a variable (column) named '0' or '1'. Check that you have properly renamed the table variables.
Health = {'Fair';'Poor';'Excellent';'Good';'Fair'};
Height = [71;69;64;67;64];
Weight = [176;163;131;133;119];
BloodPressure = [124 93; 109 77; 125 83; 117 75; 122 80];
T1 = table(Health,Height,Weight,BloodPressure,'VariableNames',{'groupnames','0','1','2'});
G = groupsummary(T1,'groupnames',"sum",{'0','1'})
For your second question, the code you are proposing won't work well. You'd only ever end up with the final results, as each loop will overwrite the previous one. There is also no need to put this in a loop. Just use the colon operator and width as shown here:
G = groupsummary(T1,'groupnames',"sum",2:width(T1)-1)
4 commentaires
Cris LaPierre
le 25 Avr 2023
Modifié(e) : Cris LaPierre
le 25 Avr 2023
It looks like that is an undocumented syntax. Use this instead: "VariableNamingRule","preserve".
You can read more here: https://www.mathworks.com/help/matlab/ref/readtable.html#mw_47bd1f02-32a6-44af-aedd-70395ba4c587
I also prefer strings over character arrays, so I would ultimately do this.
data = readtable('table.xlsx','ReadVariableNames',true,'VariableNamingRule','preserve',...
'TextType','string')
G1 = groupsummary(data,'groupnames',"sum",{'0','1'})
G2 = groupsummary(data,'groupnames',"sum",3:width(data)-1)
Plus de réponses (0)
Voir également
Catégories
En savoir plus sur Structures dans Help Center et File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!