How to get random samples with a certain distance between them?

Hello everybody,
I have the following problem: I have two columns that correspond to the x and y coordinates. They look like this, but longer. (In my real case I have a matrix A(27889,2) whose values range from 0 to 5.)
A = 0 0
1 0
2 0
3 0
4 0
0 1
1 1
2 1
3 1
4 1
0 2
1 2
My goal is to select random samples (2, 3, etc.) such that all samples are separated by a distance between a minimum and a maximum (that is, within a range). I have written the following code, which works, but it is not very robust because of the distance condition: it keeps drawing 3 random samples until they fall inside the established range, so as the range is made narrower and narrower, the program runs for a very long time.
n = prod(size(A));
nsemillas = 3; % For 3 random samples
min_dist = 2; % Minimum distance
max_dist = 5; % Maximum distance
while (true)
    semillas_indexes = randperm(n,nsemillas);
    [row,col] = ind2sub(size(A),semillas_indexes);
    for i = 1:nsemillas
        semilla(i,:) = A(row(1,i),:);
    end
    dist1 = sum(abs(semilla(1,:)));
    dist2 = sum(abs(semilla(2,:)));
    dist3 = sum(abs(semilla(3,:)));
    dif1 = abs(dist1-dist2);
    dif2 = abs(dist2-dist3);
    dif3 = abs(dist1-dist3);
    if dif1 > min_dist & dif2 > min_dist & dif3 > min_dist
        if dif1 < max_dist & dif2 < max_dist & dif3 < max_dist
            break;
        end
    else
        continue;
    end
end
I would like to know if there is any way to make the program more robust with this distance condition between samples.
Thanks in advance.
J.F.

Accepted Answer

You have a [27889 x 2] matrix containing values from 0 to 5. If these are the integers 0 to 5, there are only 6*6 = 36 different possible rows. This means that most of your input data are repeated, so it is very unlikely to find a set of 3 rows that are pairwise distinct from each other.
Your method of selecting values even allows the same row to be picked more than once. A better approach:
uA = unique(A, 'rows');
nA = size(uA, 1); % not prod(size(uA)), which is numel(uA) by the way
nsemillas = 3; % For 3 random samples
min_dist = 2; % Minimum distance
max_dist = 5; % Maximum distance
while true
    row = randperm(nA, nsemillas); % nsemillas distinct row indices into uA
    semilla = uA(row, :);          % index the unique rows, all seeds at once
    dist1 = sum(semilla(1,:)); % No need for ABS() here
    dist2 = sum(semilla(2,:));
    dist3 = sum(semilla(3,:));
    dif1 = abs(dist1-dist2);
    dif2 = abs(dist2-dist3);
    dif3 = abs(dist1-dist3);
    if dif1 > min_dist && dif2 > min_dist && dif3 > min_dist
        if dif1 < max_dist && dif2 < max_dist && dif3 < max_dist
            break;
        end
    end
end
I'm not really sure what you want to achieve. How do you define "distance"? Why are you searching in a non-unique data set? Can you provide some input data and the wanted output?
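The loop above can be sketched for any number of seeds. This is only a sketch: it keeps the same "distance" as the code above (the absolute difference of the row sums), and the stand-in grid, the values of min_dist and max_dist, and the max_tries guard are assumptions made for illustration. Note that this quantity is one-dimensional: for 3 seeds the two outermost sums differ by more than 2*min_dist, so a valid set can only exist if 2*min_dist < max_dist. With min_dist = 2 and max_dist = 5 that difference has to fall strictly between 4 and 5, which can never happen for integer coordinates.

```matlab
% Hypothetical stand-in data: a 51-by-51 grid on [0, 5] in both coordinates.
[X, Y] = meshgrid(linspace(0, 5, 51));
A = [X(:), Y(:)];

uA = unique(A, 'rows');
s  = sum(uA, 2);      % row sums, computed once outside the loop
nA = size(uA, 1);

nsemillas = 3;
min_dist  = 1;        % assumed values, chosen so that a solution exists
max_dist  = 4;
max_tries = 1e5;      % guard so an infeasible range cannot loop forever

pairMask = triu(true(nsemillas), 1);   % selects each pair of seeds once
found = false;
for k = 1:max_tries
    row = randperm(nA, nsemillas);
    sr  = s(row(:));                   % column vector of the seeds' row sums
    d   = abs(sr - sr.');              % pairwise differences (needs R2016b+)
    d   = d(pairMask);
    if all(d > min_dist & d < max_dist)
        found = true;
        break
    end
end
if ~found
    error('No valid set of seeds found in %d tries.', max_tries);
end
semilla = uA(row, :);
```

Changing nsemillas is then enough to ask for 2, 4 or more seeds, and the error makes an impossible range visible instead of hanging.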

3 Comments

Thanks for your reply.
The objective is to obtain from database A three seeds that are separated by a distance defined by a range. I can give you more information.
The real input is a matrix A of size 27889x2, where column 1 is the X axis and column 2 is the Y axis.
Arranged on a grid, it would look like the following matrix (note: this matrix must not be used for the calculation; it is only there to locate the data visually). It is a 167x167 matrix.
The goal is to randomly select three samples. For example, imagine the first seed is at row 2, column 3, whose value is 0.06. The other 2 seeds must lie within a range whose minimum could be 0.12; for this example, the next seed would have to be at least 4 rows lower, or 4 columns to the right, or maybe 2 rows lower and 2 columns to the right (or any other combination). In the same way, the third seed must also keep a minimum distance from the other two.
But remember, this is only to help understand database A (it is not possible to use the 167x167 matrix), which is what the problem has to be solved with. So I think the method I showed is fine for obtaining three different seeds (I will check your method in depth). The problem arises when I try to implement the range conditions (maximum and minimum distance). The way I did it is not robust.
The output would be the seed values (the row with the X and Y coordinates) and also the position (row index) within the 27889 rows.
I would like to know if there is any function like pdist2 (I have tried it and didn't get anywhere) in order to:
  • Create a first seed
  • Define the values that are available to be selected (range with minimum and maximum distance)
  • Select the second seed
  • Repeat step 2, but with respect to both seeds
  • Create seed 3.
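The steps listed above could be sketched as follows. This is a minimal sketch under loud assumptions: "distance" is taken to be the Euclidean distance between (x, y) rows, which the thread never confirms, the grid is hypothetical stand-in data, and max_restarts is an illustrative guard. The distance is computed with plain arithmetic; pdist2(A, A(idx(k),:)) from the Statistics and Machine Learning Toolbox should give the same column of distances if that toolbox is available.

```matlab
% Hypothetical stand-in data: 167-by-167 grid on [0, 5], i.e. 27889 rows.
[X, Y] = meshgrid(linspace(0, 5, 167));
A = [X(:), Y(:)];

nsemillas    = 3;
min_dist     = 2;      % assumed Euclidean bounds
max_dist     = 5;
max_restarts = 1000;   % guard against an infeasible range

n = size(A, 1);
for attempt = 1:max_restarts
    allowed = true(n, 1);            % rows that may still be chosen
    idx = zeros(1, nsemillas);       % row positions of the seeds in A
    for k = 1:nsemillas
        cand = find(allowed);
        if isempty(cand)
            break                    % dead end: restart with a new first seed
        end
        idx(k) = cand(randi(numel(cand)));
        % Euclidean distance from every row of A to the new seed (R2016b+)
        d = sqrt(sum((A - A(idx(k), :)).^2, 2));
        % keep only rows inside the range with respect to ALL seeds so far
        allowed = allowed & d > min_dist & d < max_dist;
    end
    if all(idx > 0)
        break
    end
end
assert(all(idx > 0), 'No valid set of seeds found.');
semilla = A(idx, :);   % seed coordinates; idx holds their row positions
```

Every seed is drawn only from rows that already satisfy the range with respect to the earlier seeds, so no draw is wasted on an invalid candidate. The result is not a uniform sample over all valid triples, which may or may not matter here.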
Sorry for this long reply, but I wanted to make myself understood, because this is difficult for me to explain.
Thank you very much in advance for taking the time to read this. If you can't understand me, don't worry, I'll try to figure it out on my own. Thanks a lot.
J.F.
Jan on 15 Feb 2021
Edited: Jan on 15 Feb 2021
As far as I understand, the input matrix A consists of pairwise distinct rows. Is this correct?
I do not understand how A and the 167x167 matrix are related. This matrix is symmetric, so is this the pairwise distance between the rows?
Because your matrix is small with 27889x2 elements, it is possible to determine all distances between 2 points: storing each of the 27889*27888/2 pairs once as a double needs 3.1 GB. Based on this set you could select 2 matching points and determine a 3rd point dynamically. Would single precision satisfy your needs as well? This would reduce the memory consumption by 50%.
You have mentioned a distance. The code "dist1 = sum(semilla(1,:));" looks confusing, because this is an unusual way to define a distance. If all you need is really the sum of the components, why not calculate sum(A,2) once before searching for the points? So please explain your mathematical definition of "distance".
Can you post a small working example, which produces the wanted output? This would clarify your needs.
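For reference, the 3.1 GB estimate in this comment can be reproduced with a short calculation. The sketch assumes that each unordered pair of rows is stored once; a full square distance matrix would need twice as much.

```matlab
n = 27889;                    % number of rows in A
nPairs = n * (n - 1) / 2;     % each unordered pair of rows counted once
bytesDouble = 8 * nPairs;     % double precision: 8 bytes per distance
bytesSingle = 4 * nPairs;     % single precision: 4 bytes per distance
fprintf('%d pairs: %.2f GB as double, %.2f GB as single\n', ...
        nPairs, bytesDouble / 1e9, bytesSingle / 1e9);
```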
Thanks for your answer and for your willingness to help.
Thanks to your messages I have restructured the way of doing it and I have achieved what I wanted.
Once again, thank you very much for the help.
J.F.


Version

R2019b
