
Predicting some point between choices using distance matrix from pdist()

Faris Jan on 4 Oct 2020
I have this distance matrix for kNN points (obtained from the function pdist()) and I'm trying to predict whether point 6 is 'unacceptable' or 'acceptable' using the kNN technique with the 3 nearest neighbors. In the matrix, points 1 and 4 are labeled 'acceptable' and points 2, 3, and 5 are labeled 'unacceptable'. I think I have to use the fitcknn function with the name-value pair arguments 'Distance','euclidean','DistanceWeight','inverse', but beyond that I have no idea how to go about it. How do you use fitcknn() in a case like this, and then predict()? I know the inverse distances are given by inverseDistances = 1 ./ d, with 'd' being the distances given below, but can I use that result somehow?

Answers (1)

Harsh Parikh on 7 Oct 2020
Hi,
The 'fitcknn()' function fits a k-nearest neighbor classifier.
Its typical syntax is:
fitcknn(Train_Data, Labels, <Name_Value_Pairs>)
The function is designed to work with the original data points, not with a precomputed distance matrix.
However, if you don't have the source data from which the distance matrix was generated, you can use multidimensional scaling to recover approximate coordinates and fit a kNN classifier on them, as shown below:
x = [0    0.67 1.28 0.94 0.72 0.64;
     0.67 0    0.74 0.83 0.91 0.97;
     1.28 0.74 0    0.7  1.03 1.2;
     0.94 0.83 0.7  0    0.37 0.56;
     0.72 0.91 1.03 0.37 0    0.19;
     0.64 0.97 1.2  0.56 0.19 0];
raw_x = cmdscale(x); % Classical multidimensional scaling: approximate coordinates for all 6 points
% Labels of points 1-5, as a column vector (one label per training row)
Y = ["acceptable";"unacceptable";"unacceptable";"acceptable";"unacceptable"];
X_train = raw_x(1:5,:); % Points 1-5 are the training data
Mdl_new = fitcknn(X_train,Y,'NumNeighbors',3); % Fit a kNN classifier with k = 3
predict(Mdl_new, raw_x(6,:)) % Predict the label of the sixth point
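The question asks specifically for 'Distance','euclidean' and 'DistanceWeight','inverse'. A minimal sketch of how those name-value pairs could be passed to fitcknn on the coordinates recovered by cmdscale, together with a quick check of how closely those coordinates reproduce the given distances, might look like this. It assumes the Statistics and Machine Learning Toolbox is available; the names D, coords, maxErr, mdl, label6 and score6 are illustrative and not part of the original answer.

```matlab
% Distance matrix from the thread (symmetric, zero diagonal)
D = [0    0.67 1.28 0.94 0.72 0.64;
     0.67 0    0.74 0.83 0.91 0.97;
     1.28 0.74 0    0.7  1.03 1.2;
     0.94 0.83 0.7  0    0.37 0.56;
     0.72 0.91 1.03 0.37 0    0.19;
     0.64 0.97 1.2  0.56 0.19 0];
% Labels of points 1-5 (column vector, one label per training row)
labels = ["acceptable";"unacceptable";"unacceptable";"acceptable";"unacceptable"];

% Approximate coordinates for all six points
coords = cmdscale(D);

% Sanity check: how far are the reconstructed distances from the given ones?
D_hat  = squareform(pdist(coords));
maxErr = max(abs(D_hat(:) - D(:)));
fprintf('Largest distance reconstruction error: %.4f\n', maxErr);

% kNN with k = 3, Euclidean distance and inverse-distance weighting
mdl = fitcknn(coords(1:5,:), labels, ...
    'NumNeighbors', 3, ...
    'Distance', 'euclidean', ...
    'DistanceWeight', 'inverse');

% Predicted label and class scores for point 6
[label6, score6] = predict(mdl, coords(6,:));
disp(label6);
```

If maxErr is not close to zero, the distance matrix is not exactly Euclidean, and the neighbors found in the recovered coordinates may differ slightly from those implied by the matrix itself.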
If you don't want to recover the source data and prefer to make the prediction from the distance matrix alone, you can use the following code (the comments describe the algorithm):
x = [0    0.67 1.28 0.94 0.72 0.64;
     0.67 0    0.74 0.83 0.91 0.97;
     1.28 0.74 0    0.7  1.03 1.2;
     0.94 0.83 0.7  0    0.37 0.56;
     0.72 0.91 1.03 0.37 0    0.19;
     0.64 0.97 1.2  0.56 0.19 0];
Y = ["acceptable","unacceptable","unacceptable","acceptable","unacceptable"]; % Labels of points 1-5
n = 6; % Number of data points
x(1:n+1:end) = inf; % Self-distance doesn't count, so set the diagonal to infinity
[~, idx] = sort(x,2); % Sort each row by increasing distance; idx holds the neighbor indices
idx = idx(:,1:n-1); % Drop the last column (the point itself, at distance inf)
cc = Y(idx(6,1:3)); % Labels of the 3 nearest neighbors of point 6
% Find the majority label in 'cc'
[a,~,c] = unique(cc);
predict_fin = a{mode(c)};
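The question also asks whether inverseDistances = 1 ./ d can be used. One way to do so, sketched here under the assumption that each of the 3 nearest neighbors votes with weight 1/d and the class with the largest total weight wins, is shown below. The names d6, nn, w, classes, scores and predicted are illustrative and not from the original answer, and this is a hand-rolled vote rather than a call to fitcknn.

```matlab
% Distances from point 6 to the labeled points 1-5 (row 6 of the matrix)
d6 = [0.64 0.97 1.2 0.56 0.19];
Y  = ["acceptable","unacceptable","unacceptable","acceptable","unacceptable"];
k  = 3;

% Indices of the k nearest labeled points
[~, order] = sort(d6);
nn = order(1:k);

% Inverse-distance weights of those neighbors
w = 1 ./ d6(nn);

% Sum the weights per class and pick the class with the largest total
[classes, ~, ic] = unique(Y(nn));
scores = accumarray(ic, w(:));
[~, best] = max(scores);
predicted = classes(best);
disp(predicted);   % "unacceptable"
```

With these distances the nearest neighbors are points 5, 4 and 1. Point 5 alone contributes 1/0.19 ≈ 5.26 to 'unacceptable', while points 4 and 1 together contribute 1/0.56 + 1/0.64 ≈ 3.35 to 'acceptable', so the weighted vote favors 'unacceptable' even though a plain majority vote over the same three neighbors favors the other class.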
