Group mean for repeated samples

3 views (last 30 days)
Colin Edgar
Colin Edgar on 3 Nov 2015
Commented: Colin Edgar on 4 Nov 2015
I have a looping sample with index 1-5 through time, and I need averages for each period and index (not a simple group-wise mean of ids 1-5). This is complicated by occasional uneven sample numbers (usually there are 2 data points for each id, but occasionally 3) and by the hh and mm time stamps not changing in step with id (the same minute can exist for several ids). Data like so (the actual dataset is larger and more complicated):
yyyy mm dd hh mm rec id data1 data2
2015 5 16 12 54 65433 1 402.2262 3.598117
2015 5 16 12 54 65434 1 401.9153 3.596588
2015 5 16 12 54 65435 2 401.9349 3.597093
2015 5 16 12 54 65436 2 402.1344 3.598488
2015 5 16 12 54 65437 3 401.6603 3.599219
2015 5 16 12 55 65438 3 402.3899 3.596839
2015 5 16 13 3 65439 4 401.6629 3.574918
2015 5 16 13 3 65440 4 401.9409 3.575383
2015 5 16 13 3 65441 5 402.3803 3.574625
2015 5 16 13 3 65442 5 402.174 3.571742
2015 5 16 13 24 65443 1 401.9677 3.568859
2015 5 16 3 24 65444 1 401.7614 3.565976
I could use a moving window-type approach:
for i = 1:floor(n/period)
    means1(i) = mean(T1((i*period-(period-1)):(i*period))); % mean of the i-th block of 'period' samples
    means2(i) = mean(T2((i*period-(period-1)):(i*period)));
end
...but occasionally there are fewer samples than expected in a period.
This is a nice solution if I wanted the mean over all id 1 rows, all id 2 rows, etc.:
[ii,jj] = ndgrid(id, 1:size(data,2));               % id and column subscripts for every element of data
bb = accumarray([ii(:),jj(:)], data(:), [], @mean); % per-id mean of each data column
meanout = data - bb(id,:);                          % each row minus its id's mean
So what I want as a result is:
id n mean1 mean2
1 2 402.07075 3.5973525
2 2 402.03465 3.5977905
3 2 402.0251 3.598029
4 2 401.8019 3.5751505
5 2 402.27715 3.5731835
1 2 401.86455 3.5674175
Any advice appreciated; I could not find info for this specific issue. Some clever use of accumarray will probably work, but I could not conquer it.
2 comments
dpb
dpb on 3 Nov 2015
I think it might help to post a sufficient section of the file to illustrate the problem(s) you are trying to describe. I, at least, got lost trying to figure out what is and isn't in the data and where the time identifiers referred to came from... oh, now I see essentially the same comment in the cyclist's Answer.
Colin Edgar
Colin Edgar on 3 Nov 2015
I have posted better data, sorry about that.

Accepted Answer

Guillaume
Guillaume on 3 Nov 2015
It looks like you can't use the date to generate a key for accumarray (with unique, as the cyclist suggested). But since your data is sorted, really, the only thing you need is to find where each period starts. That's easily done: it's wherever the diff of id is negative. So:
period = cumsum([0; diff(id)<0]); % assuming id is a column vector; increments each time id wraps back to 1
[pid, ~, subs] = unique([period id], 'rows');
mean1 = accumarray(subs, data1, [], @mean);
mean2 = accumarray(subs, data2, [], @mean);
count = accumarray(subs, ones(size(subs)));
result = array2table([pid, count, mean1, mean2], 'VariableNames', {'period', 'id', 'count', 'mean1', 'mean2'})
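To illustrate the key step on the id column from the sample data (a minimal sketch, just showing what the diff/cumsum trick produces; the values are copied from the 12 rows in the question):
id = [1;1;2;2;3;3;4;4;5;5;1;1];      % id column from the sample rows
period = cumsum([0; diff(id) < 0]);  % = [0;0;0;0;0;0;0;0;0;0;1;1], i.e. a new period starts at row 11
[pid, ~, subs] = unique([period, id], 'rows');
% subs = [1;1;2;2;3;3;4;4;5;5;6;6], i.e. six groups, matching the six rows
% of the desired result table in the question.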
4 comments
Guillaume
Guillaume on 4 Nov 2015
If you have used the exact same table format for output as I've shown in my example, it's dead easy to convert the tall table into the wide format you want with unstack:
result = unstack(result, {'mean1', 'mean2'}, 'id')
You may want to use the optional NewDataVariableNames argument to rename the columns, e.g.:
result = unstack(result, {'mean1', 'mean2'}, 'id', 'NewDataVariableNames', ...
arrayfun(@(id) sprintf('id%d', id), unique(result.id), 'UniformOutput', false))
Colin Edgar
Colin Edgar on 4 Nov 2015
Thank you for being so helpful, I appreciate it :)

More Answers (1)

the cyclist
the cyclist on 3 Nov 2015
I don't fully understand how to identify the rows that should be grouped, but it seems that if you can figure that out, then you can use accumarray.
One possibility that comes to mind is to use the unique function (with the "rows" argument) on your original data. You identify the combinations of ids, times, or whatever that are unique and should be aggregated.
The third output of unique is an index mapping each original row to its unique value, and it can be used as the first argument of accumarray, telling it exactly which rows to aggregate together.
Make sense? My answer here is an example of the technique. In that example I only needed to look at one column to find the unique values, but the concept would extend to your case where multiple values are required to identify the "key".
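A minimal sketch of that pattern (the column numbers here are placeholders for whichever columns identify a group, not taken from the question):
key = mydata(:, [1 2 3 7]);                            % placeholder: columns that together define a group
[uniqueKeys, ~, idx] = unique(key, 'rows');            % idx maps each row of mydata to its group
groupMean1 = accumarray(idx, mydata(:, 8), [], @mean); % mean of one data column per group
groupCount = accumarray(idx, 1);                       % number of rows in each group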
1 comment
Colin Edgar
Colin Edgar on 3 Nov 2015
Great, thank you for the guidance! I looked at your answer to the similar problem. I can see that:
[uniqueID,~,idx] = unique(mydata(:,7));
gets: uniqueID = [1,2,3,4,5]
idx = [1,1,2,2,3,3,4,4,5,5,1,1]
I still can't get the means the way I want with idx. See in my updated example data, I already have a unique sequence number; maybe I can use that.
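For what it's worth, a minimal sketch of why unique on the id column alone isn't enough here (assuming the 12 sample rows from the question):
id = [1;1;2;2;3;3;4;4;5;5;1;1];   % id column from the sample data
[uniqueID, ~, idx] = unique(id);  % idx = [1;1;2;2;3;3;4;4;5;5;1;1]
% accumarray(idx, data1, [], @mean) would pool rows 1-2 with rows 11-12,
% averaging "period 1, id 1" together with "period 2, id 1"; that is why the
% period counter from the accepted answer is needed as part of the key.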
