Asked by Abdulmajeed Altassan
on 12 Nov 2019 at 18:32

I have this loop to calculate the distance between all of the points in the R_all array and delete the second point of a pair if the distance is less than the threshold (1.5*Df, with Df = 0.002). With a huge number of points, like 100000, it takes a long time, so I need to vectorize my code. If you can help me, thank you in advance.

n=1000; % must be defined before it is used in rand

Rx=rand(n,1)*0.2;

Ry=rand(n,1)*0.2;

Rz=rand(n,1)*0.2;

R_all=[Rx Ry Rz];

Df=0.002;

j=1; % start with the first point

while j<n

i=j+1;

while i<=n

k = norm(R_all(j,:)-R_all(i,:)); %function of distance between points

if k < 1.5*Df % check the distance between all points; should not be < 1.5 Df

R_all(i,:)=[]; % the next point moves into row i, so i is not advanced

n=n-1;

else

i=i+1;

end

end

j=j+1;

end

Answer by Rik
on 13 Nov 2019 at 9:52

Accepted Answer

After a few runs I think this would be equivalent to your loop version. Note that it generates an array of size [n n 3] in one of the steps, so memory might become an issue. The pdist function would probably help, but I don't have the Statistics and Machine Learning Toolbox.

n= 1000;

Rx=rand(n,1)*0.2;

Ry=rand(n,1)*0.2;

Rz=rand(n,1)*0.2;

R_all=[Rx,Ry,Rz];

Df=0.002;

%calculate distance matrix for every point pair

A=permute(R_all,[1 3 2]);

B=permute(R_all,[3 1 2]);

dist=sqrt( sum( (A-B).^2 , 3) );

%mask the distance to the point itself and to all previous points

dist(1:(1+size(dist,1)):end)=inf;

dist(logical(triu(ones(size(dist)))))=inf;

L=dist<1.5*Df;

L=any(L,1);

R_all(L,:)=[];
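To make it easier to see what the permute step produces, here is a tiny worked example with three hand-picked points (the values and the names P and D are mine, chosen so the distances come out as whole numbers). It uses bsxfun so that it should also run on releases without implicit expansion; on R2016b or later, A-B gives the same result.

```matlab
% Three points chosen so that the pairwise distances are 5, 12 and 13
P = [0 0 0; 3 4 0; 0 0 12];

A = permute(P,[1 3 2]);   % size [3 1 3]: points run along dimension 1
B = permute(P,[3 1 2]);   % size [1 3 3]: points run along dimension 2

% Expand both to [3 3 3], then sum the squared differences over x, y, z
D = sqrt( sum( bsxfun(@minus,A,B).^2 , 3) );

% D(r,c) is the distance between row r and row c of P:
% D = [ 0  5 12
%       5  0 13
%      12 13  0]
```

The matrix is symmetric with a zero diagonal, which is why only one triangle needs to be checked against 1.5*Df.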

Abdulmajeed Altassan
on 13 Nov 2019 at 11:03

Hello Rik,

I tried to run your code, but it gives me an error at dist=sqrt( sum( (A-B).^2 , 3) );

"Array dimensions must match for binary array op."

This message appears. Thank you, Mr. Rik.

Rik
on 13 Nov 2019 at 13:02

That means you're using a release prior to R2016b. If you're using an older version it is always a good idea to mention which release you are using.

In this case you can do the expansion explicitly with bsxfun instead of relying on implicit expansion:

dist=sqrt( sum( bsxfun(@minus,A,B).^2 , 3) );
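For a large n such as the 100000 points mentioned in the question, the [n n 3] array alone would take about 240 GB as doubles (1e5*1e5*3*8 bytes), so the full distance matrix is probably not an option there. Below is a minimal sketch of a memory-light alternative, assuming the intended rule is "keep a point, then drop every later point closer than 1.5*Df to it". Only the inner loop is vectorized, so memory stays proportional to n. The names keep, thr and R_kept are mine, not from the thread, and bsxfun is used so it should also run before R2016b.

```matlab
rng(0);                      % fixed seed, only to make this sketch reproducible
n = 1000;
R_all = rand(n,3)*0.2;
Df = 0.002;
thr = 1.5*Df;

keep = true(n,1);            % keep(i) becomes false once point i is discarded
for j = 1:n-1
    if ~keep(j)
        continue             % a discarded point does not eliminate other points
    end
    idx = j + find(keep(j+1:n));   % later points that are still candidates
    % squared distances from point j to all remaining later points at once
    d2 = sum( bsxfun(@minus, R_all(idx,:), R_all(j,:)).^2 , 2);
    keep(idx(d2 < thr^2)) = false; % compare squared values to avoid sqrt
end
R_kept = R_all(keep,:);
```

This is still quadratic in time, so it only addresses the memory problem. The surviving set can also differ from the full-matrix code, because here a point that was already discarded no longer causes other points to be removed.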

Answer by Jeremy Marcyoniak
on 12 Nov 2019 at 18:49

Edited by Jeremy Marcyoniak
on 12 Nov 2019 at 19:00

Something like this?

n= 1000;

Rx=rand(n,1)*0.2;

Ry=rand(n,1)*0.2;

Rz=rand(n,1)*0.2;

R_all=[Rx Ry Rz];

Df=0.002;

Dr = diff(R_all,[],1); % compute differences between each row

% changing the third input from 1 to 2 makes diff compute column differences instead

[i,j,k] = find(Dr<1.5*Df); % get indices of out-of-tolerance lines

i = i + 1; % Delete the second of the lines that makes an OOT result

R_all(i,:) = []; % remove lines that are out-of-tolerance

Jeremy Marcyoniak
on 12 Nov 2019 at 19:01

Made a correction to the FIND line

Abdulmajeed Altassan
on 13 Nov 2019 at 5:35

Thank you Jeremy,

This code does not compute the distance as a value (magnitude) to compare with 1.5*Df. Also, the loop I showed in my question compares each point with all the other points: it calculates the distance between point 1 and all points from 2 to 1000, then between point 2 and all points from 3 to 1000, and so on. Many thanks for your help.
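A small illustration of both points, using three made-up points (P, thr and the other names are only for this example): diff compares consecutive rows only, and comparing its output with the threshold tests each coordinate difference separately rather than the distance.

```matlab
P = [0 0 0; 1 0 0; 0 0 0.001];   % point 3 lies 0.001 away from point 1
thr = 0.003;                      % 1.5*Df with Df = 0.002

Dr = diff(P,[],1);                % rows are P2-P1 and P3-P2, never P3-P1
consec = sqrt(sum(Dr.^2,2));      % distances between consecutive points, both about 1
d13 = norm(P(3,:)-P(1,:));        % 0.001, the close pair that diff never sees

elementwise = Dr < thr;           % also true for zero or negative components
```

Here no consecutive pair is closer than thr, yet elementwise contains true entries, while the pair that really is too close (points 1 and 3) is never compared.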

