calculate mean and max values using the groupsummary command

Hello everyone,
As you can see from the attached picture I have successfully imported a large table matrix (T2) that shows certain wave motion variables for several years (hourly). I am trying to study the mean and maximum yearly values using "groupsummary" ( more or less like this: T22 = groupsummary(T2,"Var1","year",'mean'); ) but the command doesn't work and I am pretty sure it is because the 3rd and 6th columns have values between apostrophes ( 'x' ). I have run the same command in other situations and it works perfectly fine so I guess that the solution would be to get rid of those apostrophes..
Is there a smart way to do that? Of course I cannot do it manually since the matrix is extremely large.
Thank you in advance for your help and patience.

6 commentaires

You need to conver those char arrays into numbers first. Can you attach a small portion of the data?
Ok let me see.. is it ok like this?
'01/01/2004 00:00' 0.330000000000000 '0.42' 0.540000000000000 7.91000000000000 '151.74'
'01/01/2004 01:00' 0.270000000000000 '0.35' 0.450000000000000 5.40000000000000 '150.88'
'01/01/2004 02:00' 0.260000000000000 '0.32' 0.420000000000000 8.29000000000000 '193.84'
'01/01/2004 03:00' 0.280000000000000 '0.36' 0.460000000000000 7.91000000000000 '138.03'
'01/01/2004 04:00' 0.260000000000000 '0.33' 0.420000000000000 7.34000000000000 '150.61'
'01/01/2004 05:00' 0.280000000000000 '0.35' 0.450000000000000 8.16000000000000 '158.17'
'01/01/2004 06:00' 0.280000000000000 '0.36' 0.460000000000000 8.16000000000000 '168.64'
'01/01/2004 07:00' 0.300000000000000 '0.38' 0.490000000000000 8.03000000000000 '160.52'
@Tiziano Bagnasco: the best solution would be to fix your data importing, so that the numeric data are imported as numeric and not as text. If you want us to help you with that, please upload a sample data file by clicking the paperclip button.
Ok ! So, Basically I have several .csv files ( one for each year) and the first five files have missing values for the 3rd and 6th columns.. I'll show you what I did:
ds = datastore('*.csv');
T = readall(ds); % read and load all csv files
T1 = T(1:33465,:); % I am separating the years that have values in the 3rd and 6th column from those that don't
% now I remove the two columns with '-' values because i couldn't convert
% them to NaN
T1 =removevars(T1,{'WLCH1_10_m_'});
T1= removevars(T1,{'WLCMeanDir_Degree_'});
T2 = T(33466:end,:); % here is the rest of the years with all the info ( but still in 'x' format forn 3rd and 6th column)
G1a = groupsummary(T1,"Var1","year",'mean');
G1b = groupsummary(T1,"Var1","year",'max');
G2a = groupsummary(T2,"Var1","year",'mean');
G2b = groupsummary(T2,"Var1","year",'max');
I am very confused.. let's suppose i only want to convert one element of that table from char array to number, e.g. T2(1,3)= '0.42'
How can i convert that?
And how should i change the import command if i want to make this change since the beginning?
Thank you
Seems your data can be retrieved by using readtable without any issues. Below shows the data from your attached csv file on year 2004.
On the other hand, since your data are already separated on a yearly basis. There is no need to use function groupsummary to determine the mean and maximum values for each year. You can just use function mean or max directly.
data = readtable('https://www.mathworks.com/matlabcentral/answers/uploaded_files/1117965/Wave%20Data%202004_Open%20Data_WLC.csv','VariableNamingRule','preserve');
head(data,5)
ans = 5×6 table
Var1 WLC Hm0 (m) WLC H 1/10 (m) WLC Hmax (m) WLC Peak Tp (s) WLC Mean Dir (Degree) ________________ ___________ ______________ ____________ _______________ _____________________ 01/01/2004 00:00 0.33 0.42 0.54 7.91 151.74 01/01/2004 01:00 0.27 0.35 0.45 5.4 150.88 01/01/2004 02:00 0.26 0.32 0.42 8.29 193.84 01/01/2004 03:00 0.28 0.36 0.46 7.91 138.03 01/01/2004 04:00 0.26 0.33 0.42 7.34 150.61

Connectez-vous pour commenter.

Réponses (1)

tt = readtimetable("https://in.mathworks.com/matlabcentral/answers/uploaded_files/1117915/Wave%20Data%201994_Open%20Data_WLC.csv","VariableNamingRule","preserve")
tt = 7357×5 timetable
Time WLC Hm0 (m) WLC H 1/10 (m) WLC Hmax (m) WLC Peak Tp (s) WLC Mean Dir (Degree) ________________ ___________ ______________ ____________ _______________ _____________________ 26/01/1994 18:00 0.23 {'-'} 0.41 8.53 {'-'} 26/01/1994 19:00 0.22 {'-'} 0.31 7.53 {'-'} 26/01/1994 20:00 0.23 {'-'} 0.37 7.53 {'-'} 26/01/1994 21:00 0.15 {'-'} 0.22 8 {'-'} 26/01/1994 22:00 0.19 {'-'} 0.36 4.41 {'-'} 26/01/1994 23:00 0.17 {'-'} 0.22 4.92 {'-'} 27/01/1994 00:00 0.15 {'-'} 0.23 7.11 {'-'} 27/01/1994 01:00 0.16 {'-'} 0.21 6.74 {'-'} 27/01/1994 02:00 0.2 {'-'} 0.32 7.11 {'-'} 27/01/1994 03:00 0.15 {'-'} 0.24 5.82 {'-'} 27/01/1994 04:00 0.16 {'-'} 0.23 6.74 {'-'} 27/01/1994 05:00 0.16 {'-'} 0.3 6.4 {'-'} 27/01/1994 06:00 0.2 {'-'} 0.31 6.4 {'-'} 27/01/1994 07:00 0.28 {'-'} 0.4 7.11 {'-'} 27/01/1994 08:00 0.47 {'-'} 0.85 7.53 {'-'} 27/01/1994 09:00 0.38 {'-'} 0.54 7.11 {'-'}
retime(tt(:,vartype("numeric")),"yearly","mean")
ans = 1×3 timetable
Time WLC Hm0 (m) WLC Hmax (m) WLC Peak Tp (s) ________________ ___________ ____________ _______________ 01/01/1994 00:00 0.32572 0.51003 7.5126
retime(tt(:,vartype("numeric")),"yearly","max")
ans = 1×3 timetable
Time WLC Hm0 (m) WLC Hmax (m) WLC Peak Tp (s) ________________ ___________ ____________ _______________ 01/01/1994 00:00 1.68 2.42 18.286

Catégories

Produits

Version

R2021b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by