K-Means Clustering with Euclidean Distance

Question

Saeed Siddiqui on 4 Jan 2021

Direct link to this question:

https://fr.mathworks.com/matlabcentral/answers/708803-k-means-clusteing-with-euclidean-distace

Answered by Rishabh Mishra on 7 Jan 2021

I have 1,000 data values and I want to run k-means clustering with 10 predefined centroids, so the start is not random. The Euclidean distance for the data values needs to be less than or equal to 0.01. I therefore need my k-means clustering to run for several iterations, with each iteration starting from the latest centroids.

My current code performs the first iteration: it computes the new centroids (C), and I then work out the Euclidean distance manually.

My question: how do I make this repeat for an unknown number of iterations, continuing until the Euclidean distance is less than or equal to 0.01? Also, is there a better way to calculate the Euclidean distance at each iteration?

The data is one-dimensional!

% Load the data
X = importdata('data');
Centroid = importdata('Centroids');
ep = 0.01;
% C holds the new centroid values;
% grp holds, for each data value, the index of its centroid
[grp, C] = kmeans(X, 10, "Start", Centroid);
% Calculate the Euclidean distance manually:
% look up the starting centroid that corresponds to each entry of grp
d = Centroid(grp);
% dist is the offset of each data value from that centroid
dist = X - d;
% Mean absolute distance (for 1-D data, |x - c| is the Euclidean distance)
euclidean = sum(abs(dist)) / 1000
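
For reference, the repetition asked about here could be sketched roughly as follows in plain MATLAB, without the 'kmeans' function. This is only a sketch under stated assumptions: the synthetic X and C stand in for the contents of 'data' and 'Centroids', the stopping rule is read as "no centroid moves by more than ep between iterations" (the question does not say which distance must reach 0.01), and maxIter and maxShift are hypothetical names introduced for illustration.

```matlab
% Synthetic stand-ins for the question's files (assumption: column vectors)
X  = [linspace(0, 1, 500)'; linspace(5, 6, 500)'];  % 1,000 one-dimensional values
C  = linspace(min(X), max(X), 10)';                 % 10 starting centroids
ep = 0.01;                                          % tolerance from the question
maxIter = 100;                                      % safety cap (assumption)

for iter = 1:maxIter
    % Assignment step: index of the nearest centroid for every point.
    % X is n-by-1 and C' is 1-by-k, so X - C' expands to n-by-k (R2016b+).
    [~, grp] = min(abs(X - C'), [], 2);

    % Update step: each centroid becomes the mean of its assigned points.
    Cnew = C;
    for j = 1:numel(C)
        members = X(grp == j);
        if ~isempty(members)      % an empty cluster keeps its old centroid
            Cnew(j) = mean(members);
        end
    end

    % Stopping rule (assumption): largest centroid movement is within ep.
    maxShift = max(abs(Cnew - C));
    C = Cnew;
    if maxShift <= ep
        break
    end
end

% Re-assign against the final centroids, then measure the mean distance.
[~, grp] = min(abs(X - C'), [], 2);
meanDist = mean(abs(X - C(grp)));
fprintf('stopped after %d iterations, mean distance = %.4f\n', iter, meanDist);
```

If the stopping rule is instead meant to apply to the mean point-to-centroid distance, the same loop can test meanDist inside the loop, but with a fixed number of centroids that value may never fall to 0.01, so the iteration cap matters.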


Answer 1

Rishabh Mishra on 7 Jan 2021

Direct link to this answer:

https://fr.mathworks.com/matlabcentral/answers/708803-k-means-clusteing-with-euclidean-distace#answer_592918

Hi,

Based on my understanding of the issue you describe, I would like to highlight a few points:

k-means clustering, or Lloyd’s algorithm, is an iterative, data-partitioning algorithm. No further explicit iterations are required; you can simply use the ‘kmeans’ function as it is.
The cluster centres (or centroids) are obtained after several iterations. On convergence, the total distance from the points in each cluster to their cluster centre is at a (local) minimum.
The output of the ‘kmeans’ function is [idx, C, sumd, D], where D is a matrix that stores the distances from every point to every cluster centre. With the default distance metric, these are squared Euclidean distances.
Since you are using a predefined number of cluster centres (k = 10), the cluster centres obtained are the best fit found for that k, with minimized distances. However, this does not guarantee that the distance between the points and their corresponding cluster centres falls below 0.01.
Increasing the number of cluster centres further may or may not reduce the distance below 0.01. As the number of cluster centres approaches the number of observation points, the Euclidean distance approaches 0. When the number of cluster centres equals the number of observation points, the Euclidean distances become 0.
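
As a rough illustration of the points above, the distances can be read directly from the outputs of ‘kmeans’ rather than computed by hand. This sketch assumes the Statistics and Machine Learning Toolbox is available; X and C0 are synthetic stand-ins for the asker's data and starting centroids, and dSq, dist, meanDist and maxDist are names introduced here for illustration. It also assumes the default 'sqeuclidean' metric, under which D holds squared distances, hence the square root.

```matlab
% Synthetic stand-ins for the asker's data and starting centroids (assumption)
X  = [linspace(0, 1, 500)'; linspace(5, 6, 500)'];     % 1,000 values, 1-D
C0 = [linspace(0.1, 0.9, 5)'; linspace(5.1, 5.9, 5)']; % 10 starting centroids

% kmeans iterates internally until it converges; no outer loop is needed.
[idx, C, sumd, D] = kmeans(X, 10, 'Start', C0);

% D is n-by-k: row i holds the distance from point i to every centroid.
% Pick, for each point, the entry belonging to its own cluster.
n    = numel(X);
dSq  = D(sub2ind(size(D), (1:n)', idx));  % squared distance to own centroid
dist = sqrt(dSq);                         % Euclidean distance to own centroid

meanDist = mean(dist);
maxDist  = max(dist);
fprintf('mean distance = %.4f, max distance = %.4f\n', meanDist, maxDist);
```

Comparing meanDist or maxDist against 0.01 then shows whether k = 10 is enough for the data at hand; if it is not, the only lever left is a larger k.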

Hope this helps
