How do we calculate a distance matrix for a data set in an Excel file?

Dear experts,
I have a dataset D of N points in M-dimensional space, stored in an Excel file called data.xls.
The file is laid out as follows:
  • the first column (A) holds the names of my data points (x1, x2, ..., xn),
  • the columns from (B ... M) hold the features, which are weighted according to some calculation. In my data set n = 127 and m = 1200.
  • Xij holds the feature weights for (i = 1..n), (j = 1..m)
I need to calculate the distance matrix based on cosine distance. I think the procedure looks like this:
1- every row Xi (a data point) is normalised to unit length (= 1), independently of the others, so the resulting matrix contains the normalised data points.
2- the distance matrix is then computed using cosine distance, where (I think) cosine distance = 1 - cosine similarity (the dot product).
I would be grateful if anyone could help me import the dataset into MATLAB and carry out these steps, as I'm new to MATLAB.

Accepted Answer

Guillaume
Guillaume on 13 Mar 2017
Edited: Guillaume on 13 Mar 2017
1. Normalising the rows is easy:
NormalisedMatrix = OriginalMatrix ./ sqrt(sum(OriginalMatrix .^ 2, 2)); %assumes R2016b
2. Getting the cosine similarity is also fairly simple. I'm using the formula in this wikipedia article:
cossimilarity = @(a, b) sum(a.*b, 2) ./ sqrt(sum(a.^2, 2) .* sum(b.^2, 2));
similarity = squeeze(cossimilarity(OriginalMatrix, permute(OriginalMatrix, [3 2 1]))); %assumes R2016b
cosdistance = 1 - similarity;
The above gives you NxN symmetric matrices of the similarity and distance between each pair of row vectors.
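The question also asks how to get the data out of Excel, which the two steps above do not cover. Here is a minimal sketch, assuming data.xls sits in the current folder, has no header row, and holds the point names in column A with only numeric weights in the other columns. The names `txt`, `names`, `rowLength` and `newLength`, and the small fallback matrix, are illustrative and not from the original post. `xlsread` returns the numeric cells in its first output and the text cells in its second.

```matlab
% Sketch only: assumes data.xls is in the current folder, has no header row,
% with point names in column A and numeric weights in the other columns.
if exist('data.xls', 'file') == 2
    [OriginalMatrix, txt] = xlsread('data.xls');  % numeric cells, text cells
    names = txt(:, 1);                            % point names from column A
else
    % Hypothetical fallback data so the sketch runs without the file
    OriginalMatrix = [1 2 0; 4 5 0; 7 0 0];
    names = {'x1'; 'x2'; 'x3'};
end

% Normalise each row to unit length; bsxfun also works before R2016b
rowLength = sqrt(sum(OriginalMatrix.^2, 2));
NormalisedMatrix = bsxfun(@rdivide, OriginalMatrix, rowLength);

% Check: every row of NormalisedMatrix should now have length 1
newLength = sqrt(sum(NormalisedMatrix.^2, 2));
```

Note that a row containing only zeros has length 0, so its normalised values come out as NaN; such rows would need to be handled separately.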

8 Comments

ahmed obaid
ahmed obaid on 13 Mar 2017
Edited: ahmed obaid on 13 Mar 2017
Thank you very much for your help. Could you please explain what [3 2 1] means in the second part of the code?
What are the inputs to the first line of code? My original matrix is A.
Does this code normalise every row to unit length (= 1), with each row normalised independently of the others?
I am using MATLAB 2015a. Thanks.
The normalisation to unit length is in 1. There is no normalisation in 2. since it has zero effect on the result. The cosine similarity is independent of the length of the vectors.
permute(x, [3 2 1]) moves the rows into the 3rd dimension. This allows you to calculate the cosine similarity of each row against all the other rows at once.
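To make both points concrete, here is a small sketch that should run on any release. The variable names `P`, `szP`, `samePage`, `s1`, `s2` and `sameSim` are made up for illustration. It shows where the rows end up after permute, and that rescaling two vectors leaves their cosine similarity unchanged.

```matlab
A = [1 2 0; 4 5 0; 7 0 0];

% permute with [3 2 1] swaps dimensions 1 and 3
P = permute(A, [3 2 1]);
szP = size(P);                            % [1 3 3]: one row per "page"
samePage = isequal(P(1, :, 2), A(2, :));  % true: page 2 holds row 2 of A

% Scaling the vectors does not change their cosine similarity
cossim = @(a, b) sum(a.*b, 2) ./ sqrt(sum(a.^2, 2) .* sum(b.^2, 2));
s1 = cossim(A(1, :), A(2, :));
s2 = cossim(3 * A(1, :), 0.5 * A(2, :));
sameSim = abs(s1 - s2) < 1e-12;           % true
```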
OK, to keep it simple, suppose my matrix is A = [1 2 0;4 5 0;7 0 0], where every row is a vector. Could you please apply your code to this so it is clear to me? I don't understand how to apply your code to my problem. Thanks for your help.
I'm not sure what is difficult to understand:
1. Normalisation
A = [1 2 0;4 5 0;7 0 0]
normalisedA = A ./ sqrt(sum(A.^2, 2))
2. cosine similarity
cossimilarity = @(a, b) sum(a.*b, 2) ./ sqrt(sum(a.^2, 2) .* sum(b.^2, 2));
similarity = squeeze(cossimilarity(A, permute(A, [3 2 1]))); %assumes R2016b
cosdistance = 1 - similarity;
Please find the errors below:
Error using ./
Matrix dimensions must agree.
and for the second part: Error using .*
Matrix dimensions must agree.
Error in @(a,b)sum(a.*b,2)./sqrt(sum(a.^2,2).*sum(b.^2,2))
similarity = squeeze(cossimilarity(A, permute(A, [3 2 1]))); %assumes R2016b
%see the comment there? ----------------------------------------------^
You're not using R2016b. In previous versions:
normalisedA = bsxfun(@rdivide, A, sqrt(sum(A.^2, 2)));
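similarity = squeeze(sum(bsxfun(@times, A, permute(A, [3 2 1])), 2)) ./ sqrt(sum(A.^2, 2) * sum(A.^2, 2).'); %bsxfun needs an element-wise function, so the sum is applied outside it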
Thank you for your patience. Can you please update this code to work under 2015a? I can't upgrade my version and I don't have a copy of 2016.
"can you please update this code to work under 2015a"
I wrote, just above, "In previous versions:", followed by the two lines that need to be replaced to work in all versions before R2016b, including R2015a.
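For anyone on an older release who wants a cross-check, here is a sketch of a different formulation, not the code posted above: once the rows have unit length, the cosine similarity of two rows is just their dot product, so a single matrix product gives the whole similarity matrix. The name `rowLength` is illustrative.

```matlab
A = [1 2 0; 4 5 0; 7 0 0];

% Step 1: normalise each row to unit length (works before R2016b)
rowLength = sqrt(sum(A.^2, 2));               % 3x1 vector of row lengths
normalisedA = bsxfun(@rdivide, A, rowLength);

% Step 2: entry (i,k) is the dot product of unit rows i and k
similarity = normalisedA * normalisedA.';
cosdistance = 1 - similarity;

% cosdistance is approximately
%   0        0.0222   0.5528
%   0.0222   0        0.3753
%   0.5528   0.3753   0
```

This avoids building the N-by-M-by-N intermediate array, which may matter for larger data sets.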
