'pca' vs 'svd' or 'eig' functions

Pranav Aggarwal on 16 Mar 2021
Commented: Pranav Aggarwal on 18 Mar 2021
Hi,
I am trying to compute the principal components of a data set. However, I get entirely different results from the 'pca' function than from the 'eig' function. The 'eig' function gives the same results as the 'svd' function for my data.
I pass the raw data to the 'pca' function.
For 'eig', I compute the correlation matrix and pass that to the 'eig' function.
I am puzzled as to why the results differ and would be grateful for your help! Code below:
testmat = rand(20,5);
testcorrelMat = corr(testmat);
testeig = eig(testcorrelMat);
testsvd = svd(testcorrelMat);
[testcoeff, ~, testlatent] = pca(testmat);
[sort(testsvd), sort(testeig), sort(testlatent)]

Accepted Answer

the cyclist on 16 Mar 2021
You will get the same result from pca() if you standardize the input data first:
rng default
testmat = rand(20,5);
% Standardize the data
testmat = (testmat - mean(testmat))./std(testmat);
testcorrelMat = corr(testmat);
testeig = eig(testcorrelMat);
testsvd = svd(testcorrelMat);
[testcoeff, ~, testlatent] = pca(testmat);
[sort(testsvd), sort(testeig), sort(testlatent)]
ans = 5×3
    0.2238    0.2238    0.2238
    0.6422    0.6422    0.6422
    0.8504    0.8504    0.8504
    1.4606    1.4606    1.4606
    1.8229    1.8229    1.8229
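For readers wondering why standardizing matters: pca() centers each column by default but does not rescale it, so its latent output holds the eigenvalues of the covariance matrix of the data, not of the correlation matrix. The correlation matrix is the covariance matrix of the z-scored data, which is why standardizing first makes the numbers agree. Below is a minimal sketch of both comparisons. The variable names are illustrative only, and the use of the 'VariableWeights','variance' option as an alternative to standardizing by hand is my own suggestion, not something proposed in the thread.

```matlab
% Sketch: compare pca() with eig() on the covariance and correlation matrices.
rng default
X = rand(20,5);                      % raw, unstandardized data

% pca() centers the columns but does not rescale them, so its
% eigenvalues (latent) are those of the covariance matrix.
[~, ~, latentRaw] = pca(X);
eigCov = sort(eig(cov(X)), 'descend');

% Weighting each variable by the inverse of its sample variance makes
% pca() work on the correlation matrix without standardizing X by hand.
[~, ~, latentCorr] = pca(X, 'VariableWeights', 'variance');
eigCorr = sort(eig(corr(X)), 'descend');

% Both differences should be at the level of rounding error.
maxDiffCov  = max(abs(latentRaw  - eigCov))
maxDiffCorr = max(abs(latentCorr - eigCorr))
```

Note that with 'VariableWeights' the coefficient matrix returned by pca() is not orthonormal, so this shortcut is most convenient when only the eigenvalues are needed.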
2 Comments
Steven Lord on 16 Mar 2021
To standardize the data you can use the normalize function, whose default normalization method is 'zscore'.
rng default
testmat = rand(20,5);
% Standardize the data
testmat = normalize(testmat);
testcorrelMat = corr(testmat);
testeig = eig(testcorrelMat);
testsvd = svd(testcorrelMat);
[testcoeff, ~, testlatent] = pca(testmat);
results = [sort(testsvd), sort(testeig), sort(testlatent)]
results = 5×3
    0.2238    0.2238    0.2238
    0.6422    0.6422    0.6422
    0.8504    0.8504    0.8504
    1.4606    1.4606    1.4606
    1.8229    1.8229    1.8229
format longg
results - results(:, 1)
ans = 5×3
    0     1.11022302462516e-16    -1.94289029309402e-16
    0     4.44089209850063e-16    -9.99200722162641e-16
    0    -1.11022302462516e-16     3.33066907387547e-16
    0    -1.33226762955019e-15    -1.55431223447522e-15
    0                        0    -8.88178419700125e-16
Looks pretty good to me.
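The differences above are on the order of 1e-16 to 1e-15, which is rounding error for values near 1, so the three methods agree to within floating-point precision rather than bit for bit. If you want to check this in code, a tolerance comparison is safer than ==. The following is only a sketch: the threshold tol is an arbitrary value I picked for illustration, not a recommendation from the thread.

```matlab
% Sketch: confirm the three sets of eigenvalues agree within a tolerance.
rng default
X = normalize(rand(20,5));           % z-score each column
[~, ~, latent] = pca(X);
results = [sort(svd(corr(X))), sort(eig(corr(X))), sort(latent)];

% Compare every column against the first one using a tolerance, not ==.
tol = 1e-12;                         % illustrative threshold (an assumption)
maxAbsDiff = max(max(abs(results - results(:,1))));
allAgree = maxAbsDiff < tol
```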
Pranav Aggarwal on 18 Mar 2021
Thanks Steven and 'the cyclist' - solved!

Version

R2017b
