combine data into hourly-based data

Question

ahmad Saad on 21 August 2023

https://fr.mathworks.com/matlabcentral/answers/2011082-combine-data-into-hourly-based-data

Commented: ahmad Saad on 21 August 2023

Accepted Answer: Voss

data.mat

Col2 is a time column, and I need to group the attached data into hourly bins.

For example:

for 0 < col2 <= 1, get the median of the corresponding values of col3 (and col4)

for 1 < col2 <= 2, get the median of the corresponding values of col3 (and col4)

...

for 23 < col2 <= 24, get the median of the corresponding values of col3 (and col4)

So I get a matrix with three columns:

c4 = [i median(col3) median(col4)];

where i = 1:24

My attempt:

for i=1:24
    id=find(data(:,2)>i-1 & data(:,2)>i)
    m1(i)= median(data(id,3));
    m2(i)= median(data(id,4));
    c4(i,[1:3])=[i m1(i) m2(i)];
end

Any help would be appreciated.

Answer 1

Voss on 21 August 2023

https://fr.mathworks.com/matlabcentral/answers/2011082-combine-data-into-hourly-based-data#answer_1290702

load data.mat

First, your approach, modified (the condition is now i-1 <= col2 < i):

for i=1:24
    id=find(data(:,2)>=i-1 & data(:,2)<i);
    m1(i)= median(data(id,3));
    m2(i)= median(data(id,4));
    c4(i,[1:3])=[i m1(i) m2(i)];
end
disp(c4)
    1.0000    2.0049    2.8200
    2.0000    1.6326    3.1700
    3.0000    0.9196    3.1550
    4.0000       NaN       NaN
    5.0000       NaN       NaN
    6.0000    0.9596    1.2100
    7.0000       NaN       NaN
    8.0000    1.9756    4.3400
    9.0000       NaN       NaN
   10.0000       NaN       NaN
   11.0000       NaN       NaN
   12.0000    4.6718   10.4350
   13.0000       NaN       NaN
   14.0000       NaN       NaN
   15.0000       NaN       NaN
   16.0000    6.1635    6.9350
   17.0000    4.5366    6.6450
   18.0000       NaN       NaN
   19.0000       NaN       NaN
   20.0000    2.1452    5.4050
   21.0000    2.2494    4.8300
   22.0000    1.9169    3.5600
   23.0000    2.0172    2.9000
   24.0000       NaN       NaN

Another approach to calculate the medians for each hour:

hr = discretize(data(:,2),0:24);
[g,g_id] = findgroups(hr);
meds = splitapply(@(x)median(x,1),data(:,[3 4]),g);
disp(meds);
    2.0049    2.8200
    1.6326    3.1700
    0.9196    3.1550
    0.9596    1.2100
    1.9756    4.3400
    4.6718   10.4350
    6.1635    6.9350
    4.5366    6.6450
    2.1452    5.4050
    2.2494    4.8300
    1.9169    3.5600
    2.0172    2.9000
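
To see why meds has one row per hour that actually contains data, rather than 24 rows, here is a minimal sketch of the same discretize/findgroups/splitapply chain on made-up data. The variables t and v are hypothetical stand-ins for data(:,2) and data(:,[3 4]); they are not taken from data.mat.

```matlab
% Hypothetical demo data (not from data.mat)
t = [0.2; 0.8; 2.5; 2.9; 2.1];        % times in hours
v = [1 10; 3 30; 5 50; 7 70; 9 90];   % two value columns

% Default discretize bins are left-inclusive: edges(k) <= t < edges(k+1)
hr = discretize(t, 0:3);              % hour bin of each row: [1;1;3;3;3]

% findgroups numbers only the bins that occur; g_id lists those bins
[g, g_id] = findgroups(hr);           % g = [1;1;2;2;2], g_id = [1;3]

% One median per occupied bin, computed column-wise
meds = splitapply(@(x) median(x,1), v, g);

disp([g_id meds])                     % rows for hours 1 and 3 only
```

No time falls in hour 2 here, so findgroups never creates a group for it and meds has two rows; g_id records which hour each row belongs to.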

Then, if you don't want the final result to include medians for hours where there is no data:

c4 = [g_id meds];
disp(c4);
    1.0000    2.0049    2.8200
    2.0000    1.6326    3.1700
    3.0000    0.9196    3.1550
    6.0000    0.9596    1.2100
    8.0000    1.9756    4.3400
   12.0000    4.6718   10.4350
   16.0000    6.1635    6.9350
   17.0000    4.5366    6.6450
   20.0000    2.1452    5.4050
   21.0000    2.2494    4.8300
   22.0000    1.9169    3.5600
   23.0000    2.0172    2.9000

Or if you do want to include those NaN medians:

c4 = NaN(24,3);
c4(:,1) = 1:24;
c4(g_id,[2 3]) = meds;
disp(c4);
    1.0000    2.0049    2.8200
    2.0000    1.6326    3.1700
    3.0000    0.9196    3.1550
    4.0000       NaN       NaN
    5.0000       NaN       NaN
    6.0000    0.9596    1.2100
    7.0000       NaN       NaN
    8.0000    1.9756    4.3400
    9.0000       NaN       NaN
   10.0000       NaN       NaN
   11.0000       NaN       NaN
   12.0000    4.6718   10.4350
   13.0000       NaN       NaN
   14.0000       NaN       NaN
   15.0000       NaN       NaN
   16.0000    6.1635    6.9350
   17.0000    4.5366    6.6450
   18.0000       NaN       NaN
   19.0000       NaN       NaN
   20.0000    2.1452    5.4050
   21.0000    2.2494    4.8300
   22.0000    1.9169    3.5600
   23.0000    2.0172    2.9000
   24.0000       NaN       NaN
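
The question asked for intervals of the form i-1 < col2 <= i, whereas the loop at the start of this answer uses i-1 <= col2 < i. If the right-inclusive convention matters, a loop along the following lines might work. This is only a sketch on made-up data: the matrix demo and the variable nHours are hypothetical and are not part of data.mat.

```matlab
% Hypothetical demo data: [id, time in hours, value A, value B]
demo = [1 0.5 10 100;
        2 1.0 20 200;
        3 1.5 30 300;
        4 2.0 40 400];
nHours = 3;

c4 = NaN(nHours,3);      % preallocate; hours without data stay NaN
c4(:,1) = 1:nHours;
for i = 1:nHours
    % Right-inclusive bin: i-1 < t <= i
    id = demo(:,2) > i-1 & demo(:,2) <= i;
    if any(id)
        c4(i,2:3) = median(demo(id,3:4),1);
    end
end
disp(c4)
```

Here the time 1.0 is counted in hour 1 and 2.0 in hour 2; with the left-inclusive condition they would move to hours 2 and 3 respectively. Preallocating c4 also avoids growing the array inside the loop.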

Answer 2

Dyuman Joshi on 21 August 2023

https://fr.mathworks.com/matlabcentral/answers/2011082-combine-data-into-hourly-based-data#answer_1290712

There is no need to use find in the for loop; the logical index can be used directly:

load('data.mat')
for i=1:24
    %Comparison was incorrect 
    id=data(:,2)>=i-1 & data(:,2)<i;
    m1(i)= median(data(id,3));
    m2(i)= median(data(id,4));
    c4(i,[1:3])=[i m1(i) m2(i)];
end
disp(c4)
    1.0000    2.0049    2.8200
    2.0000    1.6326    3.1700
    3.0000    0.9196    3.1550
    4.0000       NaN       NaN
    5.0000       NaN       NaN
    6.0000    0.9596    1.2100
    7.0000       NaN       NaN
    8.0000    1.9756    4.3400
    9.0000       NaN       NaN
   10.0000       NaN       NaN
   11.0000       NaN       NaN
   12.0000    4.6718   10.4350
   13.0000       NaN       NaN
   14.0000       NaN       NaN
   15.0000       NaN       NaN
   16.0000    6.1635    6.9350
   17.0000    4.5366    6.6450
   18.0000       NaN       NaN
   19.0000       NaN       NaN
   20.0000    2.1452    5.4050
   21.0000    2.2494    4.8300
   22.0000    1.9169    3.5600
   23.0000    2.0172    2.9000
   24.0000       NaN       NaN
%Now the vectorized approach
%Time column
vec=data(:,2);
%Specify the bin edges
bins = 0:24;
%Discretize into bins that include the right edge,
%as described in the problem statement, i.e. loweredge < data <= upperedge
idx=discretize(vec,bins,'IncludedEdge','right');
%Accumulate according to the indices obtained by discretization
%and apply the median function to the data.
%The output size is given as a column vector because the indices are a column vector;
%the number of bins is one less than the number of edges
fun = @(x) accumarray(idx,data(:,x),[numel(bins)-1 1],@median);
%Desired output
out = [(1:24)' fun(3) fun(4)];
disp(out)
    1.0000    2.0049    2.8250
    2.0000    1.6326    3.1750
    3.0000    0.9196    3.1550
    4.0000         0         0
    5.0000         0         0
    6.0000    0.9596    1.2100
    7.0000         0         0
    8.0000    1.9756    4.3400
    9.0000         0         0
   10.0000         0         0
   11.0000         0         0
   12.0000    4.6718   10.4350
   13.0000         0         0
   14.0000         0         0
   15.0000         0         0
   16.0000    6.1635    6.9350
   17.0000    4.5366    6.6450
   18.0000         0         0
   19.0000         0         0
   20.0000    2.1452    5.4000
   21.0000    2.2494    4.8200
   22.0000    1.9169    3.5500
   23.0000    2.0172    2.8900
   24.0000         0         0

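The main difference is that the for-loop approach yields NaN, whereas the accumarray approach gives 0, for bins that contain no values. Note also that the loop uses left-inclusive bins (i-1 <= col2 < i) while the discretize call uses right-inclusive bins, so values lying exactly on an hour boundary are assigned to different bins; a few of the column-4 medians differ slightly between the two outputs.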
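
If NaN is preferred for empty bins in the accumarray approach as well, accumarray accepts a fill value as its fifth argument. A minimal sketch on made-up data follows; the vectors t and v and the variable edges are hypothetical and are not taken from data.mat.

```matlab
% Hypothetical demo data (not from data.mat)
t = [0.5; 1.0; 1.5; 3.2];   % times in hours
v = [10; 20; 30; 40];       % one value column
edges = 0:4;

% Right-inclusive bins: edges(k) < t <= edges(k+1)
idx = discretize(t, edges, 'IncludedEdge', 'right');

% The fifth argument is the fill value for bins that receive no data
med = accumarray(idx, v, [numel(edges)-1 1], @median, NaN);

disp([(1:numel(edges)-1)' med])
```

Bin 3 receives no data here, so it is filled with NaN instead of 0. One caveat: discretize returns NaN for times outside the edge range, and accumarray requires positive integer indices, so such rows would need to be filtered out first.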

ahmad Saad on 21 August 2023

Dyuman Joshi: Thanks for your response.
