Because the cluster data is 24-dimensional, it is difficult to visualize directly. A common approach is to first project or transform the data to a lower dimension (typically 2 or 3) and then apply visualization techniques to the reduced-dimensional data. As an example, suppose the "kmeans" function is applied to a data matrix "data" (300 x 24) with the number of clusters set to 3:
rng("default"); data = randn(300, 24); [idx, C] = kmeans(data, 3);
Then here are some visualization options:
Option 1: Plot 2 or 3 dimensions of interest. For instance, to plot the 4th dimension versus the 9th dimension of your data, one can do the following:
scatter(data(:,4), data(:,9), [], idx); % plot the three clusters in different colors
hold on;
plot(C(:,4), C(:,9), 'kx'); % plot the centroids
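The same idea extends to a 3-D view with "scatter3" and "plot3" (a sketch; the choice of dimensions 4, 9, and 17 here is arbitrary):
scatter3(data(:,4), data(:,9), data(:,17), [], idx); % three clusters in different colors
hold on;
plot3(C(:,4), C(:,9), C(:,17), 'kx'); % centroids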
Option 2: First reduce the dimensionality of your data using principal component analysis (PCA), and then plot the data in the principal-component space:
[standard_data, mu, sigma] = zscore(data); % standardize the data so that each variable has mean 0 and variance 1
[coeff, score, ~] = pca(standard_data); % perform PCA
new_C = (C - mu)./sigma*coeff; % apply the same standardization and PCA transformation to the centroids
scatter(score(:,1), score(:,2), [], idx) % plot the first 2 principal components of the data (three clusters shown in different colors)
hold on
plot(new_C(:,1), new_C(:,2), 'kx') % plot the first 2 principal components of the centroids
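A 2-D principal-component plot is only faithful if the first two components capture most of the variance. The fifth output of "pca" reports the percentage of variance explained by each component, so you can check this (a short sketch using the variables above):
[~, ~, ~, ~, explained] = pca(standard_data); % percent variance explained per component
fprintf('First two PCs explain %.1f%% of the variance\n', sum(explained(1:2)));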
Option 3: Use "silhouette" function to measure the goodness of the clustering:
silhouette(data, idx);
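When called with an output argument, "silhouette" returns the per-observation silhouette values instead of plotting, so the result can also be summarized numerically; values near 1 indicate well-separated points, while values near 0 or below suggest overlapping clusters (a sketch using the variables above):
s = silhouette(data, idx); % silhouette value for each observation, in [-1, 1]
fprintf('Mean silhouette value: %.3f\n', mean(s));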