MATLAB Answers

ranking (ordering) values with repeats

owr on 23 Mar 2012
Commented: Nataraja M on 26 Mar 2018
Hello Community,
I'm hoping some of you have a clever solution to this problem. I'm looking for a fast and efficient way to rank (order) a vector of numbers in a particular way when repeated values arise.
To make it simple, suppose I have a row vector:
data = [-1 2 0 -2 0]
I know I can rank them using the 3rd output of "unique":
>> [~,~,rnk] = unique(data)
rnk =
2 4 3 1 3
What I like about this is that it assigns the same rank to the repeated zeros. What I don't like about this is that the top rank is now "4" even though I have 5 values. I would prefer this:
>> rnk = myrank(data)
rnk =
2 5 3 1 3
I've also played around with the second output of "sort" quite a bit, but since this output gives the indices of the sorted values within the original array, there is no simple way (that I've found) to associate the same rank with repeated values.
I'm just wondering if there is something simple that I'm missing.
Thanks!

Accepted Answer

Oleg Komarov on 23 Mar 2012
If you have the Statistics Toolbox (though it is somewhat overkill for this):
floor(tiedrank([-1 2 0 -2 0]))
ans =
2 5 3 1 3
Otherwise:
data = [-1 2 0 -2 0];
% Sort data
[srt, idxSrt] = sort(data);
% Find where the repetitions are
idxRepeat = [false diff(srt) == 0];
% Rank with ties but without skipping
rnkNoSkip = cumsum(~idxRepeat);
% Preallocate rank
rnk = 1:numel(data);
% Adjust for ties (and skip)
rnk(idxRepeat) = rnkNoSkip(idxRepeat);
% Sort back
rnk(idxSrt) = rnk
rnk =
2 5 3 1 3
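A note for readers with more than one group of ties: on an input such as [1 1 2 2 3], the snippet above appears to return 1 1 3 2 5 rather than 1 1 3 3 5, because rnkNoSkip does not account for the ranks skipped by earlier ties. Likewise, floor(tiedrank(...)) matches the requested ranking only when ties come in pairs. Below is a minimal sketch of one possible adjustment, assuming a row vector and that every tied value should take the sorted position of its first occurrence. The names isRepeat, firstPos and groupNo are introduced here for illustration and are not from the original answer.

```matlab
data = [1 1 2 2 3];                  % two separate groups of ties
[srt, idxSrt] = sort(data);          % sorted values and their original positions
isRepeat = [false, diff(srt) == 0];  % true where a sorted value equals its predecessor
firstPos = find(~isRepeat);          % sorted position of the first member of each group
groupNo  = cumsum(~isRepeat);        % group number of every sorted element
rnkSorted = firstPos(groupNo);       % each element takes its group's first position
rnk = zeros(size(data));
rnk(idxSrt) = rnkSorted              % map the ranks back to the original order
```

With this input rnk is 1 1 3 3 5, and with data = [-1 2 0 -2 0] it is 2 5 3 1 3, as in the question.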
  4 Comments
Tommaso Fornaciari on 4 Dec 2016
Hi, is there a way to assign tied observations two different consecutive ranks? Following on from the original question, the output I need is 2 5 4 1 3 or 2 5 3 1 4.
Thank you
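One way this kind of ranking could be obtained is from the second output of sort, since sort keeps repeated values in their original order. A minimal sketch, assuming ties should be broken by position in the input (earlier element gets the lower rank):

```matlab
data = [-1 2 0 -2 0];
[~, idxSrt] = sort(data);        % tied elements keep their original relative order
rnk = zeros(size(data));
rnk(idxSrt) = 1:numel(data)      % the k-th smallest element receives rank k
```

For this input rnk is 2 5 3 1 4. The other ordering (2 5 4 1 3) would need a different tie-breaking rule.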


More Answers (4)

Raph on 4 May 2015
It should also work with sort() and ismember()
data_sorted = sort(data);
[~, rnk] = ismember(data,data_sorted)
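This works because the second output of ismember holds, for each element of data, the lowest index at which that value occurs in data_sorted, so tied values share the position of their first sorted occurrence. A small check with a three-way tie, plus a descending variant (the variable names below are chosen here for illustration):

```matlab
data = [5 0 5 5 2];                          % three-way tie on the value 5
[~, rnkAsc] = ismember(data, sort(data));    % rank 1 = smallest value
[~, rnkDesc] = ismember(data, sort(data, 'descend'));  % rank 1 = largest value
disp(rnkAsc)
disp(rnkDesc)
```

Here rnkAsc is 3 1 3 3 2 and rnkDesc is 1 5 1 1 4.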
  1 Comment
Brad Stiritz on 28 May 2016
Very impressive, Raph! Thanks for your contribution. Excellent use of built-in vectorized functions.



sunbeam on 6 Mar 2013
This should work. I couldn't figure out how to do it without a loop, but at least this only loops over the duplicate entries. Someone let me know if you come up with a better way.
function outrank = rankWithDuplicates(data,mode)
% R = rankWithDuplicates(data,mode) ranks the values in the data variable
% according to size, allowing for duplicates. Whereas sort actually
% rearranges the input, and therefore duplicates get assigned different
% indices, rankWithDuplicates will simply output the rank order allowing
% ties for duplicate entries. For example,
%
% rankWithDuplicates([1 1 5 8 8 10])
%
% will output [1 1 3 4 4 6]; and if these entries are shuffled like
%
% rankWithDuplicates([8 1 5 1 10 8])
%
% the output will be [4 1 3 1 6 4].
%
% INPUT: data, a vector of real numbers.
% mode, an optional input which can be 'ascend' or 'descend'
%
% OUTPUT: the rank order of the input data.
%
if nargin==1
    mode='ascend';
end
% Work on a row vector
[~,b]=size(data);
if b==1
    data=data';
end
% Sort data
[srt, idxSrt] = sort(data,mode);
% Find where the repetitions are
idxRepeat = [false diff(srt) == 0];
% Loop through the duplicates and carry the previous rank forward.
% I'm not sure if the loop is necessary, but it's the only way I could get it
% done.
rnk = 1:numel(data);
loopidx=find(idxRepeat>0);
for i=loopidx
    rnk(i)=rnk(i-1);
end
% Return order according to original sort
outrank(idxSrt)=rnk;

Jeyamugan T on 7 Apr 2017
I wrote this code for another purpose, but it may be useful for this problem.
function [rkList]=arrayRankEx(O)
cO=sort(O);
n=size(O,2);
rkList=zeros(1,n);
in=1;
while(in<=n)
    out=1;
    co=0;
    % Give every element equal to cO(in) the rank in, counting the matches
    while(out<=n && in<=n)
        if(O(out)==cO(in))
            rkList(out)=in;
            co=co+1;
        end
        out=out+1;
    end
    % Skip past the tied values just ranked
    in=in+co;
end
end
>>[5 7 -2 1 -1 0 0 1 5 3]
ans =
5 7 -2 1 -1 0 0 1 5 3
>> arrayRankEx([5 7 -2 1 -1 0 0 1 5 3])
ans =
8 10 1 5 2 3 3 5 8 7

Benjamin Levy on 16 Nov 2017
Not sure if this is still a 'live' thread, but the code should report these ranks for ascending order: 2.0000 5.0000 3.5000 1.0000 3.5000.
Now, suppose your data set is data = [ 11 20 2 14 15 11 13 20 7 9 1 5 17... 7 5 16 3 5 20 ]. For ascending order (correcting for ties), sortrows([ data' ranks ],2) should give the following, with column 1 = data and column 2 = ranks:
1.0000 1.0000
2.0000 2.0000
3.0000 3.0000
5.0000 5.0000
5.0000 5.0000
5.0000 5.0000
7.0000 7.5000
7.0000 7.5000
9.0000 9.0000
11.0000 11.0000
11.0000 11.0000
11.0000 11.0000
13.0000 13.0000
14.0000 14.0000
15.0000 15.0000
16.0000 16.0000
17.0000 17.0000
20.0000 19.0000
20.0000 19.0000
20.0000 19.0000
Note that the sorted data contain several consecutive runs of the same integer (e.g., ...5 5 7 7 ).
Using your code and my data set, and the same final sort, I have (column 1 data, column 2 ranks):
1 1
2 2
3 3
5 4
5 4
5 4
7 5
11 7
7 7
11 7
9 9
11 10
13 13
20 13
20 13
14 14
15 15
16 16
17 17
20 18
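The fractional ranks quoted at the start of this answer (2 5 3.5 1 3.5) follow the average-rank convention, which is what tiedrank returns. For readers without the Statistics Toolbox, here is a rough sketch of that convention built from sort, unique and accumarray. It assumes a row vector of real numbers, and ordRank, grp and avgRank are names introduced here for illustration.

```matlab
data = [-1 2 0 -2 0];
n = numel(data);
[~, idxSrt] = sort(data);
ordRank = zeros(1, n);
ordRank(idxSrt) = 1:n;                 % distinct ranks 1..n, ties broken by position
[~, ~, grp] = unique(data);            % group label shared by equal values
avgRank = accumarray(grp(:), ordRank(:), [], @mean);  % mean rank within each group
rnk = avgRank(grp(:)).'                % give every element its group's mean rank
```

For this input rnk is 2 5 3.5 1 3.5. Note that this is a different convention from the 2 5 3 1 3 ranking requested in the original question.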
  1 Comment
Nataraja M on 26 Mar 2018
Hello Sir, I used the above command sortrows([ data' ranks ],2) to rank vectors from highest to lowest, but I get the error "Not enough input arguments". Can you please help me resolve this error? Thank you.
