
calculating the mean for each column in a numerical array based on the elements in column 1

I have a numerical array (8167x11). The first column contains the numbers from 1 to 198 in ascending order. Each number is repeated several times; the number of repetitions is random, but the values are sorted in ascending order. I need to calculate the mean of each of columns 2 to 11 separately over the rows that share the same number in column 1. So the output must be a 198x11 array in which column 1 contains the numbers 1:198 and each of the other columns contains the means of the values corresponding to each number in column 1.
  1 Comment
Ziad Sari El Dine on 27 May 2022
There are a few numbers missing between 1 and 198 in column 1. Is there a way to fill in the gaps with the missing numbers and have the rest of each such row filled with NaNs or zeros?


Accepted Answer

Jan on 27 May 2022
Edited: Jan on 27 May 2022
With a simple loop:
A = [randi([1, 198], 8167, 1), rand(8167, 10)];  % Test data: keys in column 1
result = zeros(198, 11);
for k = 1:198
    match = A(:, 1) == k;                  % Rows whose key equals k
    result(k, :) = mean(A(match, :), 1);   % Column-wise mean of these rows
end
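As a small illustration of the loop above, the following sketch applies it to a hypothetical 5-by-3 matrix in which key 3 never occurs. The matrix `Asmall` and the key range 1:4 are made up for this example. Because `mean` of an empty 0-by-n selection along dimension 1 returns a row of NaN, the absent key produces a NaN row, so the key column is restored afterwards.

```matlab
% Hypothetical toy data: keys 1, 2 and 4 occur, key 3 is absent.
Asmall = [1, 10, 100; ...
          1, 20, 200; ...
          2,  5,  50; ...
          4,  1,   2; ...
          4,  3,   4];
nKeys  = 4;
result = zeros(nKeys, size(Asmall, 2));
for k = 1:nKeys
    match = Asmall(:, 1) == k;                 % Rows belonging to key k
    result(k, :) = mean(Asmall(match, :), 1);  % NaN row if no row matches
end
result(:, 1) = 1:nKeys;                        % Restore the key column
disp(result)
```

Rows 1, 2 and 4 hold the group means, and row 3 holds its key followed by NaN.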
The simple loop takes about the same time as splitapply. A faster approach:
% Sort A according to the first column:
[~, ind] = sort(A(:, 1));
B = A(ind, :);
% Determine where the elements in the first column change:
d = [true, diff(B(:, 1)).' ~= 0, true];  % TRUE at changes
c = find(d);                             % Indices where a block starts
% Loop over the keys (assumes that every key 1:198 occurs in column 1):
result = zeros(198, 11);
for k = 1:198
    nk = c(k+1) - c(k);                  % Number of rows with the same key
    % Mean over the block with the same key:
    result(k, :) = sum(B(c(k):c(k+1)-1, :), 1) / nk;
end
For a test data set:
A = [randi([1, 198], 8167, 1), rand(8167, 10)];
the block-based approach above needs about 0.0019 seconds, while splitapply needs 0.0066 seconds (MATLAB R2018b).
Note: sum(X,1) / nX is faster than mean(X,1).
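The block-based loop above indexes `c(k+1)` for every `k` up to 198, so it relies on every key from 1 to 198 being present. That holds for the random test data, but not when keys are missing. A possible variant, offered as a sketch rather than as part of the original answer, loops over the blocks that actually occur and uses each block's key as the output row. The toy matrix `Asmall` and `nKeys` are assumptions for illustration; rows of absent keys are left as NaN.

```matlab
% Hypothetical toy data: unsorted keys, key 3 is absent.
Asmall = [4,  1,   2; ...
          1, 10, 100; ...
          2,  5,  50; ...
          1, 20, 200; ...
          4,  3,   4];
nKeys = 4;
% Sort the rows by key so that equal keys form contiguous blocks:
[~, ind] = sort(Asmall(:, 1));
B = Asmall(ind, :);
% Block boundaries: TRUE where the key changes, plus an end marker
d = [true, diff(B(:, 1)).' ~= 0, true];
c = find(d);
result = nan(nKeys, size(B, 2));   % Rows of absent keys remain NaN
result(:, 1) = 1:nKeys;
for j = 1:numel(c) - 1
    rows = c(j):c(j+1) - 1;        % Rows of the j-th block
    key  = B(rows(1), 1);          % Key shared by this block
    result(key, 2:end) = sum(B(rows, 2:end), 1) / numel(rows);
end
disp(result)
```

Using the key as the row index also removes the need for the keys to be consecutive, at the cost of one extra lookup per block.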

More Answers (1)

Matt J on 27 May 2022
Edited: Matt J on 27 May 2022
Let's call your matrix A. Then,
out = splitapply(@(z) mean(z,1),A,A(:,1));
  3 Comments
Jan on 27 May 2022
If one of the groups contains only one row, mean automatically operates along the 2nd dimension. To be safe, specify the dimension over which the mean is taken:
A = [1, 2, 3, 4; ...
1, 5, 6, 7; ...
2, 1, 1, 1; ...
1, 4, 2, 1];
out = splitapply(@(x) mean(x, 1), A, A(:,1))
out = 2×4
    1.0000    3.6667    3.6667    4.0000
    2.0000    1.0000    1.0000    1.0000
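The difference is easy to check on a single row. In this small sketch (the vector `row` is simply the key-2 group from the example above), `mean` without a dimension argument collapses the row to one number, while `mean(row, 1)` returns the row unchanged:

```matlab
row = [2, 1, 1, 1];          % A group consisting of one row only
mDefault = mean(row);        % Averages along the row: 1.25
mDim1    = mean(row, 1);     % Averages down the columns: [2 1 1 1]
disp(mDefault)
disp(mDim1)
```

With the default call, single-row groups would return a scalar while the other groups return a row, which is why the dimension is given explicitly.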
Matt J on 27 May 2022
Edited: Matt J on 27 May 2022
"Is there a way to fill in the gaps with the missing numbers"
Do you really need/want the gaps filled in? If you exclude the missing numbers, the modification is easy:
out = splitapply(@(z) mean(z,1),A,findgroups(A(:,1)));
If you must have the gaps filled in, it's a few additional steps:
out_with_nans=nan(198,11);
out_with_nans(round(out(:,1)),:)=out;
out_with_nans(:,1)=1:198;
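Put together on a small hypothetical matrix (the data `Asmall` and `nKeys` are invented for illustration), the two steps might look as follows. `findgroups` maps the keys that are present to consecutive group numbers, and the group means are then written into the rows of a NaN matrix given by their keys:

```matlab
% Hypothetical toy data: key 3 is absent.
Asmall = [1, 10, 100; ...
          1, 20, 200; ...
          2,  5,  50; ...
          4,  1,   2; ...
          4,  3,   4];
nKeys = 4;
G   = findgroups(Asmall(:, 1));                % Keys 1, 2, 4 -> groups 1, 2, 3
out = splitapply(@(z) mean(z, 1), Asmall, G);  % One row of means per present key
out_with_nans = nan(nKeys, size(Asmall, 2));
out_with_nans(round(out(:, 1)), :) = out;      % Place each row at its key
out_with_nans(:, 1) = 1:nKeys;                 % Fill in the key column
disp(out_with_nans)
```

Replacing `nan` with `zeros` in the preallocation would give zero-filled rows for the missing keys instead.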


Release

R2018a
