
Excluding data that does not belong to any trained class from a KNN classifier

Nathan Garner on 11 May 2021
Answered: Ayush on 5 Jun 2024
I have a dataset of spectral data for which I'm building a KNN classifier using the Statistics and Machine Learning Toolbox. New data will be added that isn't necessarily from any of the trained classes, so I'm trying to build a system that detects data too dissimilar from every trained class and denies classification. My current idea is to use the evidence value generated from the weighted distance measures and set a threshold below which the evidence is declared too low and classification is denied. Currently I don't know how to access this value, as the score function only provides normalised results. I'd appreciate any advice on either how to access the sum of weighted distances or an alternative approach that would achieve my goal.

Answers (1)

Ayush on 5 Jun 2024
Hi,
To detect data that is too dissimilar from any of the trained classes in your KNN classifier, you can compute the distances from a new observation to the points in the training set, weight them as necessary, and then deny classification when a statistic of those distances fails a threshold. The example code below illustrates the idea:
% Example training data
X_train = [rand(100,2)*10; rand(100,2)*10+5]; % 200x2 matrix of features
Y_train = [ones(100,1); zeros(100,1)]; % 200x1 vector of class labels (1 or 0)
% Train KNN
knnModel = fitcknn(X_train, Y_train, 'NumNeighbors', 5); % Adjust 'NumNeighbors' as needed
% New observation
X_new = [7, 7]; % Example new data point
% Compute Euclidean distances
distances = sqrt(sum((X_train - X_new).^2, 2));
% Weighting scheme: inverse of distance
% Floor the distance at eps so an exact match gets a very large weight;
% zeroing it out would wrongly treat the most similar point as no evidence
weights = 1 ./ max(distances, eps);
% Sum of inverse-distance weights (larger means closer to the training data)
sumWeightedDistances = sum(weights);
% Define threshold
threshold = 0.5; % Example threshold, adjust based on experimentation
% Check against threshold
if sumWeightedDistances < threshold
disp('New observation is too dissimilar, classification denied.');
else
% Proceed with classification
label = predict(knnModel, X_new); % Only the predicted label is needed here
disp(['Classification accepted. Predicted label: ', num2str(label)]);
end
Classification accepted. Predicted label: 1
In this example, "fitcknn" trains the KNN classifier, and Euclidean distances are computed from the new observation to every training point. The rejection decision then compares a statistic against a threshold; that statistic could be the sum of inverse-distance weights (as used above), the minimum distance, or another measure that makes sense for your application. Observations that fall below the threshold are rejected rather than classified. Note that the example sums the weights over all training points, not only the nearest neighbours, so the threshold value has to be tuned on your own data.
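If you would rather base the decision only on the nearest neighbours, one possible variant is sketched below. It is a rough illustration, not a definitive implementation, and it rests on several assumptions of mine: the distance is plain Euclidean (the default for both "fitcknn" and "knnsearch"), the rejection statistic is the mean distance to the k nearest training points, and the threshold is the 99th percentile of that same statistic computed on the training set. The variable names, the percentile, and the test points are illustrative only.

```matlab
% Sketch: reject observations whose k nearest training points are too far away.
rng(0);                                        % fixed seed so the example is repeatable
X_train = [rand(100,2)*10; rand(100,2)*10 + 5]; % 200x2 feature matrix
Y_train = [ones(100,1); zeros(100,1)];          % 200x1 class labels
k = 5;
knnModel = fitcknn(X_train, Y_train, 'NumNeighbors', k);

% Calibrate the threshold on the training set: request k+1 neighbours
% because each point's nearest neighbour is itself (distance 0), then
% drop that first column before averaging.
[~, dTrain] = knnsearch(X_train, X_train, 'K', k + 1);
trainScore = mean(dTrain(:, 2:end), 2);
threshold = prctile(trainScore, 99);            % assumed cut-off, tune on held-out data

% Score new observations with the same statistic.
X_new = [7 7; 40 40];                           % second point lies far outside the training range
[~, dNew] = knnsearch(X_train, X_new, 'K', k);
newScore = mean(dNew, 2);
isRejected = newScore > threshold;              % large distance means too dissimilar

labels = predict(knnModel, X_new);
for i = 1:size(X_new, 1)
    if isRejected(i)
        fprintf('Observation %d: classification denied (mean kNN distance %.2f > %.2f)\n', ...
            i, newScore(i), threshold);
    else
        fprintf('Observation %d: predicted label %d\n', i, labels(i));
    end
end
```

Calibrating on the training data keeps the threshold in the same units as your features, which is usually easier to reason about than a fixed number. If your model uses a different distance metric or distance weighting, pass the matching 'Distance' option to "knnsearch" so the two stay consistent.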
For details on the available training options, see the documentation for the "fitcknn" function.

Version

R2019b
