How do I identify unique rows based on multiple columns and calculate the average of the rest of the columns?

Question

Leon on 19 Oct 2021

https://fr.mathworks.com/matlabcentral/answers/1567308-how-do-i-identify-unique-rows-based-on-multiple-columns-and-calculate-the-average-of-the-rest-of-the

Commented: Leon on 19 Oct 2021

I have a matrix as below:

A = [1 4 3 8; 4 5 6 9; 1 6 3 6; 2 6 9 3; 1 5 3 7];

My goal is to identify rows with both identical Column 1 and identical Column 3 values, and then calculate the average for the rest of the columns, i.e., Column 2 and Column 4, within these duplicate rows. In this example, the duplicate rows would be Rows # 1, 3, and 5. My ending matrix would be:

B = [1 5 3 7; 4 5 6 9; 2 6 9 3];

This is a much simplified example. In reality, I have 35 columns that need to be averaged, and millions of rows. What is the most efficient way of handling this? Do I have to write a loop and process each of the unique rows individually?

Many thanks!

Answer 1

Kevin Holly on 19 Oct 2021

https://fr.mathworks.com/matlabcentral/answers/1567308-how-do-i-identify-unique-rows-based-on-multiple-columns-and-calculate-the-average-of-the-rest-of-the#answer_812193

Edited: Kevin Holly on 19 Oct 2021

A = [1 4 3 8; 4 5 6 9; 1 6 3 6; 2 6 9 3; 1 5 3 7]
A = 5×4
     1     4     3     8
     4     5     6     9
     1     6     3     6
     2     6     9     3
     1     5     3     7

I am going to assume that rows count as duplicates whenever they share the same Column 1 and Column 3 values, regardless of which pair of values it is, and that only the other columns need to be averaged across those rows.

Here is my approach:

t = table(A(:,1),A(:,3))
t = 5×2 table
    Var1    Var2
    ____    ____

     1       3  
     4       6  
     1       3  
     2       9  
     1       3  
[C, ia, ic] = unique(t,'rows')
C = 3×2 table
    Var1    Var2
    ____    ____

     1       3  
     2       9  
     4       6  
ia = 3×1
     1
     4
     2
ic = 5×1
     1
     3
     1
     2
     1
ic==1
ans = 5×1 logical array
   1
   0
   1
   0
   1
A(ic==1,:)
ans = 3×4
     1     4     3     8
     1     6     3     6
     1     5     3     7
mean(A(ic==1,:))
ans = 1×4
     1     5     3     7
B = [mean(A(ic==1,:));A(ic~=1,:)]
B = 3×4
     1     5     3     7
     4     5     6     9
     2     6     9     3
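
As a small check on the walkthrough above, the sketch below calls unique on a plain numeric matrix of the two key columns instead of a table. That substitution is an assumption made here only to keep the example short, and the names keys and counts are illustrative, not from the original answer. It shows that ia picks one representative row per group and that ic maps every row to its group number.

```matlab
A = [1 4 3 8; 4 5 6 9; 1 6 3 6; 2 6 9 3; 1 5 3 7];

keys = A(:, [1 3]);                    % key columns as a numeric matrix (assumption: no table)
[C, ia, ic] = unique(keys, 'rows');    % groups come back sorted by key value

assert(isequal(keys(ia, :), C));       % ia: index of one representative row per group
assert(isequal(C(ic, :), keys));       % ic: group number of every original row

counts = accumarray(ic, 1);            % number of rows that fall into each group
disp(counts)
```

Any group whose entry in counts is greater than 1 contains duplicate rows; in this example only the first group does.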

Your answer:

B = [1 5 3 7; 4 5 6 9; 2 6 9 3]
B = 3×4
     1     5     3     7
     4     5     6     9
     2     6     9     3

Without showing work:

t = table(A(:,1),A(:,3));
[~,~,ic] = unique(t,'rows');
B = [mean(A(ic==1,:));A(ic~=1,:)]
B = 3×4
     1     5     3     7
     4     5     6     9
     2     6     9     3
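
Note that the one-liner above averages only group 1, which works here because that is the only duplicated pair in this example. For data where several pairs repeat, one possible generalization is sketched below. It is not part of the original answer: the names keyCols, S, and counts are illustrative, the 'stable' option is used only so the output rows follow the order of first appearance, and the final division relies on implicit expansion (R2016b or later).

```matlab
% Sketch: average all rows that share the same values in the key columns.
A = [1 4 3 8; 4 5 6 9; 1 6 3 6; 2 6 9 3; 1 5 3 7];
keyCols = [1 3];                       % columns that define a duplicate (illustrative name)

% 'stable' keeps the groups in order of first appearance instead of sorting them
[~, ~, ic] = unique(A(:, keyCols), 'rows', 'stable');

nRows   = size(A, 1);
nGroups = max(ic);

% S(g, r) = 1 when row r belongs to group g, so S*A sums the rows of each group
S = sparse(ic, (1:nRows)', 1, nGroups, nRows);
counts = accumarray(ic, 1);            % number of rows in each group

B = (S * A) ./ counts;                 % per-group mean of every column
disp(B)
```

The key columns pass through unchanged, since every row in a group has the same values there. No explicit loop over groups is needed, but whether this is the fastest option for millions of rows and 35 columns would have to be timed on the real data.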

Leon on 19 Oct 2021

Many thanks for the solution!
