Hello everyone,
I'm going to calculate the monthly average of daily temperatures grouping by sensor using retime function. At the moment I'm trying to select each sensor with a loop and then apply the retime function, but I have to calculate six million rows so I would like to avoid a loop to speed up the calculation
I give an example:
Time = datetime({'18-feb-2021';'19-feb-2021';...
'01-mar-2021';'04-mar-2021';'18-feb-2021';...
'19-feb-2021';'01-mar-2021';'04-mar-2021'});
Temp = [56.82;62.72;64.52;63.81;63.45;59.7;60.27;61.32];
Sensor = [12;12;12;12;13;13;13;13];
TT = timetable(Time,Sensor,Temp);
Current code:
uni_sensor = unique(Sensor);
monthly_ds = timetable;
parfor kk = 1:length(uni_sensor)
index_retime = find(TT.Sensor == uni_sensor(kk));
sensor_retime = TT(index_retime,:);
monthly_data = retime(sensor_retime(:,2), 'monthly', 'mean');
data_sensor_retime = array2table(repmat(uni_sensor(kk), size(monthly_data, 1),1), 'VariableNames', "Sensor" );
monthly_sub_id = [monthly_data, data_sensor_retime];
monthly_ds = [monthly_ds; monthly_sub_id];
end
Desired output:
Time = datetime({'feb-2021';'feb-2021';...
'mar-2021';'mar-2021'});
Temp = [59.77;61.575;64.165;60.795];
Sensor = [12;12;13;13];
TT_out = timetable(Time,Sensor,Temp);
Thanks in advance,
Gianluca

 Réponse acceptée

Marco Riani
Marco Riani le 27 Avr 2021

2 votes

I think in this example it is unnecessary to use retime.
I would proceed as follows.
Time1=char(Time);
Time2=Time1(:,4:end);
TT = table(findgroups(string(Time2)),Sensor,Temp);
groupvars={'Sensor' 'Var1'};
datavars='Temp';
groupsummary(TT,groupvars,'mean',datavars)
Instead of using groupsummary it is possible to use grpstats. Please let us know which between groupsummary and grpstats is faster.

1 commentaire

Gian23
Gian23 le 27 Avr 2021
Very interesting approach and usefull overall! I tried yor solution and it works very well.
On the evidence of one test, grpstats is faster than groupsummary.
Many thanks again!

Connectez-vous pour commenter.

Plus de réponses (1)

Eric Sofen
Eric Sofen le 4 Mai 2021

0 votes

Another approach is to unstack the timetable based on the sensor ID, so you'd have a wide timetable with separate variables temp_12, temp_13, ..., then apply retime to that without a need for grouping. I don't know if it would be faster than Marco's findgroups approach (which is quite clever), and having the sensor IDs embedded in the table variable names may or may not be useful in the long run, but it's yet another way to tackle this problem.

Produits

Tags

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by