clusterization of data in 1-D vector

Question

0 votes

I have a large logical vector that looks like V = [0 0 0 0 0 0 0 1 1 1 0 0 1 1 1 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 ..............]

I need to find the position of each group of 1s (say, the center of each group), but if two groups of ones are too close to each other (say, fewer than 3 zeros in between), I need to treat them as a single group. That is, in the first stage I need to find the groups (bold-underlined elements) and then find the center element of each group (a shift of +/-1 element does not matter).

1st stage (clusterization): [0 0 0 0 0 0 0 1 1 1 0 0 1 1 1 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 ..............]

2nd stage (find a center of each cluster): [0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 ..............]

The way I have implemented it now is as follows: I smooth the entire vector (it has a couple of million elements). The span is chosen to equal the maximum expected length of a group, and then I look for local maxima (islocalmax) with 'MinSeparation' set to the minimum distance between groups. It works, but it is really slow (I have 360x180 = 64800 vectors; yes, it is a LAT/LONG grid with ~10M elements in each vector).

Is there any way to speed this up? I believe there should be some "textbook" examples of it!


Answer 1

Adam Danz on 28 Oct 2020

Edited: Adam Danz on 28 Oct 2020

0 votes

There are lots of alternatives.

Input A is a vector of 1s and 0s.
n is the minimum number of 0s that must lie between two groups of 1s for them to count as separate groups.
T is a table showing the start and stop index and the length of each group of 1s, first for the raw groups and then after joining groups separated by fewer than n zeros.

A = [0 0 0 1 1 1 0 0 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 1 1 0 1 0 1 1 0 1 0 0 0 0 1 1 1 1];
% Length of each group of consecutive 1s
T = table();
T.OnesLength = diff(find([0;A(:);0]==0))-1;
T(T.OnesLength==0,:) = []; 
% Index of 1st '1' in each group of consecutive 1s
T.OnesStart = find(diff([0;A(:)])==1);
% Index of last '1' in each group of consecutive 1s
T.OnesStop = T.OnesStart + T.OnesLength - 1; 
% Determine the number of 0s between consecutive 1s
ZerosBetween = [T.OnesStart(2:end) - T.OnesStop(1:end-1); NaN]-1;
disp(T)
    OnesLength    OnesStart    OnesStop
    __________    _________    ________

        3             4            6   
        3             9           11   
        6            18           23   
        2            29           30   
        1            32           32   
        2            34           35   
        1            37           37   
        4            42           45   
% join groups of consecutive 1s with less than n zeros between. 
n = 3; 
joinGroups = ZerosBetween < n;
t = find(diff([0;joinGroups])==1);
f = find(diff([0;joinGroups])==-1);
T.remove = false(height(T),1); 
for i = 1:numel(t)
    T.OnesStop(t(i)) = T.OnesStop(f(i));
    T.OnesLength(t(i)) = sum(T.OnesLength(t(i):f(i))) + sum(ZerosBetween(t(i):f(i)-1));  
    T.remove(t(i)+1:f(i)) = true; 
end
T(T.remove,:) = []; 
T.remove = [];
disp(T)
    OnesLength    OnesStart    OnesStop
    __________    _________    ________

        8             4           11   
        6            18           23   
        9            29           37   
        4            42           45   

Now you can use the segment length and the start/stop indices to compute the segment centers.
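
For illustration, here is a minimal sketch of that last step. It is a vectorized variant of the merging logic rather than the table-and-loop code above, and the variable names (starts, stops, gaps, keep, clusterStart, clusterStop, centers, C) are my own choices, not part of the original answer. It assumes A contains at least one 1 and takes the lower of the two middle elements for even-length clusters, which stays within the +/-1 tolerance mentioned in the question.

```matlab
A = [0 0 0 1 1 1 0 0 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 1 1 0 1 0 1 1 0 1 0 0 0 0 1 1 1 1];
n = 3;                                     % gaps shorter than n zeros are bridged

% Start and stop index of every run of consecutive 1s.
% Assumption: A contains at least one 1 (no guard for the empty case).
d = diff([0; A(:); 0]);
starts = find(d == 1);
stops  = find(d == -1) - 1;

% Number of zeros between each run and the next one.
gaps = starts(2:end) - stops(1:end-1) - 1;

% A gap of at least n zeros separates two clusters; shorter gaps are merged.
keep = gaps >= n;
clusterStart = starts([true; keep]);       % a cluster begins after each kept gap
clusterStop  = stops([keep; true]);        % and ends right before each kept gap

% Center index of each cluster (rounded down).
centers = floor((clusterStart + clusterStop) / 2);

% Optional: marker vector with a single 1 at each cluster center,
% in the same layout as the "2nd stage" vector in the question.
C = false(size(A));
C(centers) = true;

disp([clusterStart clusterStop centers])
```

For the example vector, the cluster start/stop indices agree with the second table above (4-11, 18-23, 29-37, 42-45). Because this version has no loop and builds no table, it may be worth timing against the table-based code on the real data, although I have not benchmarked it.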

1 comment

paganelle on 28 Oct 2020

Perfect way, thank you!

It is ~5 times faster than the method I used previously.
