# How to efficiently find and step through similar values in a vector.

Serge on 15 Sep 2022
Edited: Serge on 18 Sep 2022
For a vector such as:
X = [5 5 5 2 2 ..]
is there a native (no toolbox), elegant way to identify the length of each contiguous 'region', e.g.
len = [3 2 ...]
and efficiently group their indices, e.g.
ind = {[1 2 3] [4 5] ...}
My use case is to quickly step through a large number of regions in a loop.
In my example I have a fast but ugly solution that only works if X is sorted.
n = 1e5; % vector length
X = sort(round(rand(n,1)*n)); % sorted values

tic
[U,~,J] = unique(X); % find similar 'regions' in a vector
for k = 1:numel(U) % step through regions
    I = find(k==J)'; % find indices of region elements (SLOW!)
    % do stuff
end
toc % ~10 sec

tic
ind = group_contiguous_values(X); % find indices of contiguous regions, X must be sorted
for k = 1:numel(ind)
    I = ind{k};
    % do stuff
end
toc % ~100x faster, time scales linearly with n

function ind = group_contiguous_values(X)
% Group contiguous values and list their indices (only works if X is sorted)
assert(issorted(X),'X must be sorted')
[~,~,J] = unique(X);
len = diff([0;find(diff([J(:);-1])~=0)]); % length of each 'region', e.g. X=[5 5 5 2 2 ..] > len=[3 2 ..]
ind = mat2cell(1:numel(J),1,len); % indices for each region, e.g. ind={[1 2 3] [4 5] ..}
end
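
The slow part of the first loop is the `find(k==J)` call, which scans the whole vector once per group. One possible way around it, sketched below for the general case where X is not sorted, is to sort the group numbers once and cut the resulting permutation into pieces. The example data are hypothetical, and this is only a sketch using `unique`, `sort`, `accumarray` and `mat2cell`, not the poster's method.

```matlab
% Sketch: group the indices of equal values in an unsorted vector
% without calling find() once per group.
X = [5 2 5 2 5].';               % small unsorted example (hypothetical data)
[U,~,J] = unique(X);             % U: distinct values, J: group number of each element
[~,order] = sort(J(:));          % sort is stable, so indices stay ascending within a group
len = accumarray(J(:),1).';      % number of elements in each group
ind = mat2cell(order.',1,len);   % ind{k} lists the positions where X == U(k)
for k = 1:numel(U)               % step through groups
    I = ind{k};
    % do stuff with X(I)
end
```

Here the groups are sets of equal values anywhere in X, not contiguous runs, so the two notions only coincide when X is sorted.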

### Accepted Answer

Matt J on 15 Sep 2022
Edited: Matt J on 15 Sep 2022
X = [5 5 5 2 2 3 3 3 3 3 6 6 6];
G = groupConsec(X);
[starts,stops,runlengths] = groupLims(G,1)
starts =
     1     4     6    11
stops =
     3     5    10    13
runlengths =
     3     2     5     3
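
`groupConsec` and `groupLims` are not built-in MATLAB functions, so the snippet above needs those helpers on the path. As a rough native equivalent, the same three outputs can be sketched with `diff` and `find` alone. This assumes X is a non-empty row vector; because NaN never compares equal to anything, each NaN would be counted as its own run.

```matlab
% Sketch: run starts, stops and lengths using only diff and find.
X = [5 5 5 2 2 3 3 3 3 3 6 6 6];
B = [true, diff(X)~=0];                % true where a new run begins
starts = find(B);                      % first index of each run
stops = [starts(2:end)-1, numel(X)];   % last index of each run
runlengths = stops - starts + 1;       % number of elements in each run
```

For this X the results match the output shown in the answer: starts 1 4 6 11, stops 3 5 10 13, runlengths 3 2 5 3.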
##### 1 Comment
Serge on 18 Sep 2022
Edited: Serge on 18 Sep 2022
Thank you,
I was hoping for a one-liner using native MATLAB, but I see now it warrants a function.
Here is my own version that I ended up using:
function [L,I,V,S,E] = groupconsec(X)
%Group consecutive values in a vector.
% [L,I,V,S,E] = groupconsec(X), where X is a list of values, e.g. X=[5 5 5 2 2 ..]
%L: Length of each group            e.g. L=[3 2 ..]
%I: Index of each element in X      e.g. I={[1 2 3] [4 5] ..}
%V: Value of X for each group       e.g. V=[5 2 ..]
%S: Start index for each group      e.g. S=[1 4 ..]
%E: End index for each group        e.g. E=[3 5 ..]
%
%Remarks:
%-Nonfinite values (NaN Inf -Inf) are all treated as NaN.
%-[I] can be used to easily step through 'regions' of X.
%-All outputs are row vectors, regardless of the size of X.
%
%Example:
% [L,I,V,S,E] = groupconsec([5 5 5 2 2 Inf NaN 5])

%checks
if isempty(X)
    [L,I,V,S,E] = deal([],{},[],[],[]); return %edge case
end
X = X(:).'; %ensure X is a row vector
nonval = rand; %dummy value to represent nonfinite values (assumes it does not occur in X)
X(~isfinite(X)) = nonval; %treat all nonfinite values (NaN Inf -Inf) as being the same
B = [true logical(diff(X))]; %Boundaries for each region, e.g. B=[1 0 0 1 0 ..]
L = diff(find([B true])); %Length of each region, e.g. L=[3 2 ..]
if nargout>1
    I = mat2cell(1:numel(X),1,L); %Index of each element per group, e.g. I={[1 2 3] [4 5] ..}
end
if nargout>2
    V = X(B); %Value of X for each group, e.g. V=[5 2 ..]
    V(V==nonval) = NaN;
end
if nargout>3
    S = find(B); %Start index for each group, e.g. S=[1 4 ..]
end
if nargout>4
    E = S+L-1; %End index for each group, e.g. E=[3 5 ..]
end
end
I will delete the last part of the question, about non-consecutive similar values, as I don't need it.
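
The `nonval = rand` sentinel in `groupconsec` relies on the random value never coinciding with a real entry of X. That is very unlikely for double data, but a variant that tracks nonfinite entries with a logical mask avoids the question entirely. The following is only a sketch of that idea on the example input from the function's help text; the variable names `N` and `Xf` are made up for illustration.

```matlab
% Sketch: run boundaries with nonfinite values handled by a mask
% instead of a random sentinel value.
X = [5 5 5 2 2 Inf NaN 5];
N = ~isfinite(X);                       % mask of NaN, Inf and -Inf entries
Xf = X;
Xf(N) = 0;                              % placeholder so that diff never returns NaN
changed = diff(Xf)~=0 | xor(N(1:end-1),N(2:end)); % value changed, or finite/nonfinite switched
B = [true, changed];                    % true where a new group begins
L = diff(find([B true]));               % length of each group
V = X(B);                               % value of each group
V(N(B)) = NaN;                          % report every nonfinite group as NaN
I = mat2cell(1:numel(X),1,L);           % indices of each group
```

The `xor` term keeps a finite 0 next to a NaN from being merged into the same group, which the placeholder value alone would not prevent.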


R2022a
