Hi everyone. I am trying to perform Raman spectral analysis using K-means clustering. I have 100 spectra over 534 variables (in a 100 x 534 matrix).
Now I want to cluster the 100 objects. How can I do so?
I am trying with this code, with K = 12 found by iteration. Now I need to plot the result for my data. Please help.
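K = [];
sa = [];
for k = 1:20
    [idx, c, sumd] = kmeans(matrix, k);
    sa = [sa sum(sumd)]; % total within-cluster sum of distances for this k
    K = [K k];
end
plot(K, sa); % to find appropriate k
idx = kmeans(matrix, 12);
gscatter(scoress(:,1), scoress(:,2), idx); % gscatter takes x, y, group; scoress is not defined in this snippet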
Now here I need to plot the data for all the columns rather than just 2 columns. How can I do so?

1 comment

Image Analyst
Image Analyst on 13 Jun 2020
Edited: Image Analyst on 13 Jun 2020
So you have 100 observations for each absorbance (wavenumber). The absorbances at each wavenumber are the features. And now you want 12 clusters, which will classify each spectrum into one of 12 possible classes? Can you attach your matrix so we can try it?


Accepted Answer

Image Analyst
Image Analyst on 14 Jun 2020
Well, this is what I got so far:
clc; % Clear the command window.
fprintf('Beginning to run %s.m ...\n', mfilename);
close all; % Close all figures (except those of imtool.)
clear; % Erase all existing variables. Or clearvars if you want.
workspace; % Make sure the workspace panel is showing.
format short g;
format compact;
fontSize = 15;
[numbers, strings, raw] = xlsread('data matrix.xlsx');
[rows, columns] = size(numbers);
wavenumbers = numbers(:, 1); % Column 1 holds the wavenumbers.
for col = 2 : columns
    thisSpectrum = numbers(:, col); % Columns 2 to end hold one spectrum each.
    plot(wavenumbers, thisSpectrum, '-');
    grid on;
    hold on;
end
title('All Raman Spectra', 'FontSize', 20);
xlabel('Wavenumber', 'FontSize', 20);
ylabel('Absorbance', 'FontSize', 20);
[classNumber, classCentroid] = kmeans(numbers(:, 2:end)', 12); % Transpose so each row is one spectrum.
% Plot each class separately.
hFig = figure;
for col = 2 : columns
    thisClass = classNumber(col - 1);
    thisSpectrum = numbers(:, col);
    subplot(3, 4, thisClass);
    plot(wavenumbers, thisSpectrum, '-');
    grid on;
    hold on;
    caption = sprintf('Class #%d', thisClass);
    title(caption, 'FontSize', 20, 'Interpreter', 'none');
    xlabel('Wavenumber', 'FontSize', 20);
    ylabel('Absorbance', 'FontSize', 20);
end
hFig.WindowState = 'maximized';
But I'm not really sure kmeans is what you want to do, as you can see from the spectra plotted for each class. I might talk to my spectroscopists tomorrow and see if they have any ideas. They are really world class. What do you want me to ask them?
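On the original question of showing all the variables rather than two columns, one common option is to project the spectra onto their first two principal components and color the points by class. The following is only a sketch under assumptions: the data are synthetic (the real spreadsheet is not available here), and spectra, peakCenters, trueGroup and k = 4 are made-up names and values for illustration, not anything from the thread.

```matlab
% Synthetic stand-in for the real data: 100 spectra x 534 wavenumbers (assumed layout).
rng(0);                                  % fixed seed so the sketch is repeatable
wavenumbers = linspace(200, 1800, 534);
peakCenters = [400 700 1000 1300];       % hypothetical peak positions
trueGroup = randi(4, 100, 1);            % hypothetical group of each spectrum
spectra = zeros(100, 534);
for i = 1:100
    spectra(i, :) = exp(-((wavenumbers - peakCenters(trueGroup(i))) / 30).^2) ...
        + 0.05 * randn(1, 534);
end

k = 4;                                   % 4 only because the synthetic data has 4 groups
idx = kmeans(spectra, k, 'Replicates', 5);

% Project all 534 variables onto the principal components (pca centers the data by default).
[~, score, ~, ~, explained] = pca(spectra);

figure;
gscatter(score(:, 1), score(:, 2), idx); % x, y, grouping variable
xlabel(sprintf('PC 1 (%.1f%% of variance)', explained(1)));
ylabel(sprintf('PC 2 (%.1f%% of variance)', explained(2)));
title('Spectra in PCA score space, colored by k-means class');
```

Every point is one whole spectrum, so all columns contribute to the picture even though only two axes are drawn. How faithful the picture is depends on how much variance the first two components explain.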

6 comments

ananya mittal
ananya mittal on 14 Jun 2020
Edited: ananya mittal on 14 Jun 2020
Thanks a lot. Can you help further by telling me how I can get the mean of each cluster? And how can I find out which columns of the matrix are in each cluster?
I read in several papers that K-means analysis is used to cluster similar spectra together, and that the mean spectrum of each cluster is then compared with reference spectra to identify the minerals present. I wanted to know if this is the right direction to proceed. Further, if the PCA score matrix is used as input rather than the raw data matrix, will it be more efficient?
Image Analyst
Image Analyst on 14 Jun 2020
The means (cluster centers) are returned as classCentroid.
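As a rough illustration of that, here is a sketch that lists which spreadsheet columns fall in each class and computes each class's mean spectrum by hand, so it can be compared with classCentroid. It assumes the same layout as the script above (column 1 = wavenumbers, one spectrum per remaining column) and uses random stand-in data; meanSpectra, members and columnsInSheet are names made up for this example.

```matlab
% Synthetic stand-in for the spreadsheet: column 1 = wavenumbers, columns 2..end = spectra.
rng(1);                                   % fixed seed for a repeatable sketch
numWavenumbers = 50;
numSpectra = 20;
wavenumbers = (1:numWavenumbers)';
numbers = [wavenumbers, rand(numWavenumbers, numSpectra)];

spectra = numbers(:, 2:end)';             % one spectrum per row, as kmeans expects
numClasses = 3;                           % small value for the toy data; the thread uses 12
[classNumber, classCentroid] = kmeans(spectra, numClasses);

meanSpectra = zeros(numClasses, numWavenumbers);
for c = 1:numClasses
    members = find(classNumber == c);     % indices of the spectra assigned to class c
    columnsInSheet = members' + 1;        % +1 because column 1 holds the wavenumbers
    meanSpectra(c, :) = mean(spectra(members, :), 1);
    fprintf('Class %d: %d spectra, sheet columns %s\n', ...
        c, numel(members), mat2str(columnsInSheet));
end

% Largest difference between the hand-computed means and the centroids from kmeans.
maxDifference = max(abs(meanSpectra(:) - classCentroid(:)));
fprintf('Max difference from classCentroid: %g\n', maxDifference);
```

With the default squared Euclidean distance, each row of classCentroid should be the average intensity at each wavenumber over the spectra in that class, so the printed difference is expected to be at rounding level. Plotting a row against wavenumbers gives the mean spectrum of that class.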
ananya mittal
ananya mittal on 15 Jun 2020
I would like to clarify my question. Different classes contain different numbers of spectra. For each cluster I want a mean spectrum (that is, the average intensity calculated at each wavenumber). How can I do this?
Image Analyst
Image Analyst on 15 Jun 2020
My spectroscopists say that kmeans is one way to group data. There may be better ways, but they wouldn't know without more context. Why do you think there are 12 classes? Why not some other number? Would you rather compare these against some known reference spectra to see which reference spectrum each one matches best?
I ran it again and this time I told kmeans to run 15 times to get the best guess at what the classes are. The new m-file is attached. Plus I told it to show how many spectra are in each class. You can see that some classes have only 1 spectrum in them. Be aware that because of the random seeding nature of kmeans, the numbering of the classes could be different on each run, so class 1 from run 1 might not have the same spectra as class 1 on run 2. They might be in class 7 on run 2 instead of in class 1 again.
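The attached m-file is not reproduced in this thread, so the following is only a guess at the idea, not the actual file. It assumes the repeated runs were requested through the 'Replicates' option of kmeans, counts the spectra per class with accumarray, and fixes the random seed with rng so that the class numbering repeats from run to run. The data are synthetic and the variable names are illustrative.

```matlab
rng(42);                                  % fixed seed: same class numbering on every run
% Synthetic data: 100 "spectra" x 10 variables in three shifted groups.
spectra = [randn(30, 10); randn(30, 10) + 5; randn(40, 10) - 5];

numClasses = 3;                           % toy value; the thread uses 12
% 'Replicates', 15 runs kmeans 15 times from different random starting
% centroids and keeps the solution with the lowest total sum of distances.
[classNumber, classCentroid, sumd] = kmeans(spectra, numClasses, 'Replicates', 15);

% Count how many spectra landed in each class.
counts = accumarray(classNumber, 1, [numClasses, 1]);
for c = 1:numClasses
    fprintf('Class %d contains %d spectra\n', c, counts(c));
end
fprintf('Total within-cluster sum of distances: %g\n', sum(sumd));
```

Without the rng call the grouping may come out the same while the class labels are permuted, which is the renumbering effect described above.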
ananya mittal
ananya mittal on 15 Jun 2020
Okay, I understand that the clustering changes on every run.
Actually I am trying to find the different minerals present in the sample by applying multivariate analysis.
I chose 12 classes as this is what I obtained from the elbow method.
Thanks a lot for your help.
Image Analyst
Image Analyst on 15 Jun 2020
I don't know what the elbow method is. But there are ways to have kmeans decide what the best value of k is. Some other function I think - I don't remember what it is off the top of my head. Maybe it's 12 but maybe it's not.
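For context, the elbow method is the loop from the question: plot the total within-cluster sum of distances against k and look for the point where the curve stops dropping quickly. One function that can suggest k automatically is evalclusters from the Statistics and Machine Learning Toolbox; whether that is the function meant above is a guess. The sketch below runs both on synthetic data with three well-separated groups. The silhouette criterion is just one of the available choices, and X, kList and totalSumd are illustrative names.

```matlab
rng(0);                                   % fixed seed for a repeatable sketch
% Synthetic data: 120 observations x 5 variables in three well-separated groups.
X = [randn(40, 5); randn(40, 5) + 6; randn(40, 5) - 6];

% Elbow method: total within-cluster sum of distances for each candidate k.
kList = 1:8;
totalSumd = zeros(size(kList));
for i = 1:numel(kList)
    [~, ~, sumd] = kmeans(X, kList(i), 'Replicates', 5);
    totalSumd(i) = sum(sumd);
end
figure;
plot(kList, totalSumd, 'o-');
grid on;
xlabel('Number of clusters k');
ylabel('Total within-cluster sum of distances');
title('Elbow plot');

% Automatic suggestion: evaluate k = 2..8 with the silhouette criterion
% (silhouette values are not defined for a single cluster, so k starts at 2).
eva = evalclusters(X, 'kmeans', 'silhouette', 'KList', 2:8);
fprintf('Silhouette criterion suggests k = %d\n', eva.OptimalK);
```

On real spectra the elbow is often less clear-cut than on toy data, so it may be worth comparing the suggestion from more than one criterion before settling on a value such as 12.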
