Deleting duplicates based on conditions of multiple columns

Question

Nick on 28 Dec 2020

Direct link to this question:

https://fr.mathworks.com/matlabcentral/answers/703957-deleting-duplicates-based-on-conditions-of-multiple-columns

Answered: Akash kumar on 31 Jul 2022

Hi,

I have a large dataset (100M rows x 40 columns) and I would like to delete any row that duplicates another row on a few specific columns. See the example below:

A = [1 10 4; 1 10 4; 1 11 5; 1 11 5; 1 12 6; 1 12 7; 1 13 8; 2 4 25; 2 10 28; 2 10 28; 3 5 33; 4 25 23; 4 23 24];

I would like to delete every row whose values in all three columns duplicate those of another row. In this example, rows 2, 4 and 9 would be deleted: for instance, rows 1 and 2 match in each of the three columns, so I'd want to delete one of the two (it doesn't matter which one).

I suspect the answer involves unique and logical indexing, but I haven't managed to figure it out. Any help would be much appreciated. (I'm using MATLAB R2018b.)

Thanks

3 comments

Nick on 28 Dec 2020

Thanks for this, but unfortunately I think it would work for this sample only. The actual dataset has 40 columns, and I'd like to remove rows based on duplicates in 3 columns only, rather than all of them.

Nick on 28 Dec 2020

Just found the answer. This way you can find the unique rows amongst a number of columns (in this case, columns 1, 2 and 3) and then produce the original table without the duplicate values.

% ia holds the row index of the first occurrence of each unique combination
[C,ia] = unique(A(:,1:3),'rows')
A_new = A(ia,:)
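
A minimal sketch of the same idea for a wider matrix whose key columns are not adjacent, which is closer to the 40-column case. The matrix M and the column list keyCols are made up for illustration. Without the 'stable' option, unique returns the combinations sorted by key, so sort(ia) is used here to keep the surviving rows in their original order.

```matlab
% Hypothetical 6-by-5 matrix; only columns 1, 3 and 4 decide what counts as a duplicate.
M = [1 9 10  4 0.1;
     1 8 10  4 0.2;   % duplicate of row 1 on the key columns
     2 7 10 28 0.3;
     1 6 11  5 0.4;
     2 5 10 28 0.5;   % duplicate of row 3 on the key columns
     1 4 12  6 0.6];
keyCols = [1 3 4];    % assumed key columns, not from the original post

% ia(k) is the index of the first row holding the k-th unique key combination.
[~, ia] = unique(M(:, keyCols), 'rows');

% unique sorts by key, so sort ia to keep the rows in their original order.
M_new = M(sort(ia), :);
```

All columns of the kept rows are carried over, because ia indexes whole rows of M rather than the key columns alone.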


Answer 1

Nick on 28 Dec 2020

Direct link to this answer:

https://fr.mathworks.com/matlabcentral/answers/703957-deleting-duplicates-based-on-conditions-of-multiple-columns#answer_586042

% Keep one row per unique combination of columns 1 to 3.
[C,ia] = unique(A(:,1:3),'rows')

A_new = A(ia,:)
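
Since the question mentions logical indexing, here is a sketch of the same result written as a keep/delete mask, using the example matrix from the question. The variable names keep and removed are my own and not from the original posts.

```matlab
A = [1 10 4; 1 10 4; 1 11 5; 1 11 5; 1 12 6; 1 12 7; 1 13 8; ...
     2 4 25; 2 10 28; 2 10 28; 3 5 33; 4 25 23; 4 23 24];

% First occurrence of each unique combination of columns 1 to 3.
[~, ia] = unique(A(:,1:3), 'rows');

% Logical mask: true for rows to keep, false for the repeated ones.
keep = false(size(A,1), 1);
keep(ia) = true;

removed = find(~keep);   % row numbers of the duplicates that get dropped
A_new = A(keep, :);      % indexing with the mask preserves the original row order
```

With this data, removed comes out as rows 2, 4 and 10. Row 10 is dropped rather than row 9 because unique keeps the first occurrence; the two rows are identical, so the result is equivalent.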


Answer 2

Akash kumar on 31 Jul 2022

Direct link to this answer:

https://fr.mathworks.com/matlabcentral/answers/703957-deleting-duplicates-based-on-conditions-of-multiple-columns#answer_1018540


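% With index numbers: index shows which rows of the original matrix were
% kept. I think this can help you.
% A is stored transposed here (3 x 13), so it is transposed back before calling unique.
A = [1 10 4; 1 10 4; 1 11 5; 1 11 5; 1 12 6; 1 12 7; 1 13 8; 2 4 25; 2 10 28; 2 10 28; 3 5 33; 4 25 23; 4 23 24]';
% 'stable' keeps the unique rows in their original order instead of sorting them.
[B, index] = unique(A(1:3,:).', 'rows', 'stable')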
B = 10×3
     1    10     4
     1    11     5
     1    12     6
     1    12     7
     1    13     8
     2     4    25
     2    10    28
     3     5    33
     4    25    23
     4    23    24
index = 10×1
     1
     3
     5
     6
     7
     8
     9
    11
    12
    13
