Finding the indices of duplicate values in one array

Given one array A=[ 1 1 2 3 5 6 7].
I need help finding the indices where there are duplicate values.
Thanks

Answers (9)

A = [1 2 3 2 5 3]
[v, w] = unique( A, 'stable' );
duplicate_indices = setdiff( 1:numel(A), w )
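This should work too, and is elegant. Note that it returns only the second and later occurrences of each value (indices 4 and 6 here), not the first occurrence.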

2 comments

Jun W on 11 Nov 2019
How about finding how many times are those elements repeated?
Use histcounts and look for bins with 2 or more counts.
A = [1 2 3 2 5 3]
[counts, edges] = histcounts(A)
A =
1 2 3 2 5 3
counts =
1 2 2 0 1
edges =
0.5 1.5 2.5 3.5 4.5 5.5
You can see that the bins for 2 and 3 both have 2 counts so there are multiples of 2 and 3 in A.
Note: This will find any repeats, and they don't have to be consecutive. If you want to look for consecutive repeats, call the diff() function and look for zeros.
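For the consecutive case mentioned in the note, here is a minimal sketch of the diff() idea. The sample vector and the variable names (isRepeatOfPrevious, inRun, consecutiveIdx) are made up for illustration, and it assumes A is a row vector.

```matlab
A = [1 1 2 3 3 3 5 3];
% diff(A) is zero wherever an element equals its immediate predecessor.
isRepeatOfPrevious = [false, diff(A) == 0];
% Also flag the first element of each run, so the whole run is marked.
inRun = isRepeatOfPrevious | [isRepeatOfPrevious(2:end), false];
consecutiveIdx = find(inRun)
```

The trailing 3 at index 8 is not flagged, because it is not adjacent to the other 3's.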


Image Analyst on 11 May 2018
Edited: Image Analyst on 12 May 2018
Here's one way:
A = [-2 0 1 1 2 3 5 6 6 6 7 11 40]
% Elements 3, 4, 8, 9, and 10 are repeats.
% Assume A is integers and get edges
edges = min(A) : max(A)
[counts, values] = histcounts(A, edges)
repeatedElements = values(counts >= 2)
% Assume they're integers
% Print them out and collect indexes of repeated elements into an array.
indexes = [];
for k = 1 : length(repeatedElements)
indexes = [indexes, find(A == repeatedElements(k))];
end
indexes % Report to the command window.
You get [3,4,8,9,10] as you should.

8 comments

Arthur, with your new array A = [29892, 29051, 29051], my code still works. It returns 2 and 3. Can you tell me why you're still trying to use Adam's code even after I told you it doesn't work but mine does?
A = [29892, 29051, 29051];
% Elements 2 and 3 are repeats.
% Assume A is integers and get edges
edges = min(A) : max(A);
[counts, values] = histcounts(A, edges);
repeatedElements = values(counts >= 2)
% Assume they're integers
% Print them out and collect indexes of repeated elements into an array.
indexes = [];
for k = 1 : length(repeatedElements)
indexes = [indexes, find(A == repeatedElements(k))];
end
indexes % Report to the command window.
Because I don't have the 'histcounts' function. Where can I find it?
Image Analyst on 12 May 2018
Edited: Image Analyst on 12 May 2018
Then you have a version older than R2014b. What version do you have? This should work in old versions:
A = [29892, 29051, 29051];
% Elements 2 and 3 are repeats.
% Assume A is integers and get edges
edges = min(A) : max(A);
[counts, values] = histc(A, edges);
repeatedElements = edges(counts >= 2)
% Assume they're integers
% Print them out and collect indexes of repeated elements into an array.
indexes = [];
for k = 1 : length(repeatedElements)
indexes = [indexes, find(A == repeatedElements(k))];
end
indexes % Report to the command window.
I have the 2013a version. and... IT WORKED! Thank you so much Image Analyst! I think my problem is solved now! Have a nice weekend!
You save my life (indirectly) again, Mr Image Analyst. Thank you so much. You helped someone else, then your help will be a good answer for the others, like me, lol.
I know this is an old thread but if anyone is still out there... I see a flaw in this method. Change A to include a duplicate of 40: A = [-2 0 1 1 2 3 5 6 6 6 7 11 40 40]
You still get [3,4,8,9,10] as a result. This method doesn't pick up the duplicate of 40. The reason is because counts is one index short of edges and it misses that last duplicate. Seems to be inherent in how histcounts works. Just need to be aware of that.
The last bin includes both the left and right edges, while the earlier bins include only the left edges. This is stated in the description of the edges input on the histcounts documentation page: "Bin edges, specified as a vector. The first vector element specifies the leading edge of the first bin. The last element specifies the trailing edge of the last bin. The trailing edge is only included for the last bin."
So add on a number that's greater than the maximum element in your data. Inf is a good choice.
A = [-2 0 1 1 2 3 5 6 6 6 7 11 40 40]
A = 1×14
-2 0 1 1 2 3 5 6 6 6 7 11 40 40
edges = [min(A):max(A) Inf];
[counts1, edges1] = histcounts(A, edges);
repeatedElements = edges1(counts1 >= 2)
repeatedElements = 1×3
1 6 40
Or you could use non-equally spaced bins containing the unique elements from your data.
[counts2, edges2] = histcounts(A, [unique(A) Inf])
counts2 = 1×10
1 1 2 1 1 1 3 1 1 2
edges2 = 1×11
-2 0 1 2 3 5 6 7 11 40 Inf
repeatedElements = edges2(counts2 >= 2)
repeatedElements = 1×3
1 6 40
If your data spans a wide range this can reduce the number of bins histcounts uses.
whos counts1 edges1 counts2 edges2
Name       Size     Bytes  Class    Attributes
counts1    1x43       344  double
counts2    1x10        80  double
edges1     1x44       352  double
edges2     1x11        88  double
The unique approach uses 10 bins; the non-unique approach uses 43. This is a fairly small difference for your sample A, but the impact is much larger if you have a distant outlier.
B = [A 5000]; % 5000 is far from the rest of the elements in A
edges = [min(B):max(B) Inf];
[counts1, edges1] = histcounts(B, edges);
[counts2, edges2] = histcounts(B, [unique(B) Inf]);
whos counts1 edges1 counts2 edges2
Name       Size       Bytes  Class    Attributes
counts1    1x5003     40024  double
counts2    1x11          88  double
edges1     1x5004     40032  double
edges2     1x12          96  double
Using edges = [min(B):max(B) Inf]; assumes that the input data is integer.
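If the data is not integer-valued, one way to avoid building bin edges at all is to count occurrences with unique and accumarray. This is only a sketch: the sample vector and the names vals, groupIdx, repeatedCounts and dupIdx are illustrative, and values are compared for exact equality, so numbers that differ by floating-point round-off count as different.

```matlab
A = [0.5 2.25 0.5 7 2.25 2.25 -1];
% vals: sorted unique values; groupIdx maps each element of A to its entry in vals.
[vals, ~, groupIdx] = unique(A);
groupIdx = groupIdx(:).';                 % force a row, whatever the release returns
% Number of occurrences of each unique value.
counts = accumarray(groupIdx(:), 1).';
repeatedValues = vals(counts >= 2)
repeatedCounts = counts(counts >= 2)
% Indices in A of every element whose value occurs at least twice.
dupIdx = find(counts(groupIdx) >= 2)
```

This also answers the "how many times" question, since repeatedCounts lines up element by element with repeatedValues.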


Adam on 21 Apr 2017
Edited: Adam on 21 Apr 2017
[~, uniqueIdx] = unique( A );
duplicateLocations = ismember( A, find( A( setdiff( 1:numel(A), uniqueIdx ) ) ) );
then
find( duplicateLocations )
will give you the indices if you want them rather than a logical vector.
There are probably neater methods though.
If you want only the duplicates after the first then simply
setdiff( 1:numel(A), uniqueIdx )
should do the job.

9 comments

I tried this:
[U,I]=unique(A(:,1)); repeated=setdiff(1:size(A,1),I)
but MATLAB returns this error:
Error using unique
Too many input arguments.
Error in setdiff>setdiffR2012a (line 505)
c = unique(c,order);
Error in setdiff (line 84)
[varargout{1:nlhs}] = setdiffR2012a(varargin{:});
Can somebody help me? I'm new here
Adam, this doesn't work in general:
A=[ 1 1 2 3 5 6 6 7]
[~, uniqueIdx] = unique( A );
duplicateLocations = ismember( A, find( A( setdiff( 1:numel(A), uniqueIdx ) ) ) )
indexes = find(duplicateLocations)
It gets the wrong indexes for the repeated 6's:
duplicateLocations =
1×8 logical array
1 1 1 0 0 0 0 0
indexes =
1 2 3
It should give [1,2,6,7].
Arthur, your code worked for me for the A given. If it doesn't work for you, give us your A.
My A is an arbitrary vector, like this one you used here. I would like to know why this error occurs and try to fix it. I have to find these indexes to use them on another vector. Don't know why, but the A you showed here didn't work for me =/
I am trying with an A like this: A = [29892, 29051, 29051]; But it still doesn't work for me.
Jan on 12 May 2018
"Doesn't work" is a weak description of the problem. Please post the error message or explain the difference between the results and your expectations.
This is the error message:
Error using unique
Too many input arguments.
Error in setdiff>setdiffR2012a (line 505)
c = unique(c,order);
Error in setdiff (line 84)
[varargout{1:nlhs}] = setdiffR2012a(varargin{:});
My problem is the same as the topic of this forum: Finding the indices of duplicate values in one array. I use the same solution that has been put here, but only this error message is returned to me
Commenting here because this led me to what I think is the best answer overall; it just has a mistake. The "find" in the 2nd line converts the values into indices before passing them to ismember, which makes the output nonsense. I removed that. Using the same numbers as Image Analyst above:
A=[ 1 1 2 3 5 6 6 7]
A = 1×8
1 1 2 3 5 6 6 7
[~, uniqueIdx] = unique(A);
dupeIdx = ismember( A, A( setdiff( 1:numel(A), uniqueIdx ) ) );
dupes = A(dupeIdx)
dupes = 1×4
1 1 6 6
dupeLoc = find(dupeIdx)
dupeLoc = 1×4
1 2 6 7
This works, thanks


Jan on 12 May 2018
Edited: Jan on 2 Jul 2021
function Ind = IndexOfMultiples(A)
T = true(size(A));
off = false;
A = A(:);
for iA = 1:numel(A)
    if T(iA) % if not switched already
        d = (A(iA) == A);
        if sum(d) > 1 % More than 1 occurrence found
            T(d) = off; % switch all occurrences
        end
    end
end
Ind = find(~T);
end
If the input has more than 45 elements, this is faster:
function T = isMultiple(A)
% T = isMultiple(A)
% INPUT: A: Numerical or CHAR array of any dimensions.
% OUTPUT: T: TRUE if element occurs multiple times anywhere in the array.
%
% Tested: Matlab 2009a, 2015b(32/64), 2016b, 2018b, Win7/10
% Author: Jan, Heidelberg, (C) 2021
% License: CC BY-SA 3.0, see: creativecommons.org/licenses/by-sa/3.0/
T = false(size(A));
[S, idx] = sort(A(:).');
m = [false, diff(S) == 0];
if any(m) % Any equal elements found:
    m(strfind(m, [false, true])) = true;
    T(idx) = m; % Resort to original order
end
end
MRINAL BHAUMIK on 28 Jun 2021
Edited: Walter Roberson on 3 Apr 2025
A=[ 1 1 2 3 5 6 7 6]
A = 1×8
1 1 2 3 5 6 7 6
B = A'./A
B = 8×8
1.0000 1.0000 0.5000 0.3333 0.2000 0.1667 0.1429 0.1667
1.0000 1.0000 0.5000 0.3333 0.2000 0.1667 0.1429 0.1667
2.0000 2.0000 1.0000 0.6667 0.4000 0.3333 0.2857 0.3333
3.0000 3.0000 1.5000 1.0000 0.6000 0.5000 0.4286 0.5000
5.0000 5.0000 2.5000 1.6667 1.0000 0.8333 0.7143 0.8333
6.0000 6.0000 3.0000 2.0000 1.2000 1.0000 0.8571 1.0000
7.0000 7.0000 3.5000 2.3333 1.4000 1.1667 1.0000 1.1667
6.0000 6.0000 3.0000 2.0000 1.2000 1.0000 0.8571 1.0000
B = B-diag(diag(B))
B = 8×8
0      1.0000 0.5000 0.3333 0.2000 0.1667 0.1429 0.1667
1.0000 0      0.5000 0.3333 0.2000 0.1667 0.1429 0.1667
2.0000 2.0000 0      0.6667 0.4000 0.3333 0.2857 0.3333
3.0000 3.0000 1.5000 0      0.6000 0.5000 0.4286 0.5000
5.0000 5.0000 2.5000 1.6667 0      0.8333 0.7143 0.8333
6.0000 6.0000 3.0000 2.0000 1.2000 0      0.8571 1.0000
7.0000 7.0000 3.5000 2.3333 1.4000 1.1667 0      1.1667
6.0000 6.0000 3.0000 2.0000 1.2000 1.0000 0.8571 0
[pos1 pos2]=find(B==1)
pos1 = 4×1
2 1 8 6
pos2 = 4×1
1 2 6 8
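A variant of the same pairwise idea, sketched here with a direct equality test instead of division. With A'./A, a zero in A gives 0/0 = NaN, so duplicate zeros would be missed. The names E, pairs and dupIdx are illustrative, and the == between a column and a row relies on implicit expansion, which assumes R2016b or later (bsxfun(@eq, A(:), A(:).') should be the equivalent on older releases).

```matlab
A = [1 1 2 3 5 6 7 6];
% E(i,j) is true when A(i) equals A(j).
E = A(:) == A(:).';
% Strict upper triangle: drops the diagonal self-matches and lists each pair once.
[firstIdx, secondIdx] = find(triu(E, 1));
pairs = [firstIdx, secondIdx]
% Any row of E with more than one true entry belongs to a repeated value.
dupIdx = find(sum(E, 2) > 1).'
```

Like the division version, this builds an N-by-N matrix, so it is only practical for fairly short vectors.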
In MATLAB, you can find the indices of duplicate values in an array using the `find` function along with the `unique` function. Here's how you can do it:
A = [1 1 2 3 5 6 7];
% Finding the unique elements in the array
unique_elements = unique(A);
% Initializing an empty array to store the indices of duplicate values
duplicate_indices = [];
% Iterating through each unique element
for i = 1:numel(unique_elements)
% Finding the indices of occurrences of the current unique element
indices = find(A == unique_elements(i));
% If there is more than one occurrence, add the indices to the duplicate_indices array
if numel(indices) > 1
duplicate_indices = [duplicate_indices indices];
end
end
% Displaying the indices of duplicate values
disp(duplicate_indices);
1 2
Running this code will give you the indices of the duplicate values in the array A. In this case, the output will be: 1 2
This means that the duplicate values are located at indices 1 and 2 in the array A.
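Here is my solution to find repeated values and their counts. Because it calls unique with 'rows', each row of A is treated as one item, so pass a vector as a column.
function [dup, counts] = duplicates(A)
% Unique rows in order of first appearance; n maps each row of A to its row in dup.
[dup,~,n] = unique(A, 'rows', 'stable');
% Count how many times each unique row occurs.
counts = accumarray(n, 1, [], @sum);
% Drop rows that occur only once (index whole rows so matrices keep their shape).
dup(counts==1, :) = [];
counts(counts==1) = [];
end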
Hello,
here is my attempt to solve it. I faced similar problem but in my case I wanted to have the result in two column representation. Each row contains indices of repeated values.
A = [ 1 1 2 3 5 6 7 6];
nk = nchoosek(1:length(A),2);
nk(diff(A(nk),[],2)~=0,:) = [];
disp(nk)
Cheers, Piotr
Tim on 3 Apr 2025
Edited: Tim on 3 Apr 2025
Posting @CompViscount's comment as an answer because it gives a logical array identifying which elements have more than one entry in the array. This is the answer I need.
A=[ 1 1 2 3 5 6 6 7]
A = 1×8
1 1 2 3 5 6 6 7
[~, uniqueIdx] = unique(A);
dupeIdx = ismember( A, A( setdiff( 1:numel(A), uniqueIdx ) ) )
dupeIdx = 1x8 logical array
1 1 0 0 0 1 1 0
A = [-2 0 1 1 2 3 6 5 6 6 6 7 11 40 40]
A = 1×15
-2 0 1 1 2 3 6 5 6 6 6 7 11 40 40
[~, uniqueIdx] = unique(A);
dupeIdx = ismember( A, A( setdiff( 1:numel(A), uniqueIdx ) ) )
dupeIdx = 1x15 logical array
0 0 1 1 0 0 1 0 1 1 1 0 0 1 1

Version

R2013a
