Error when using Mahalanobis distance in hierarchical cluster analysis
Hi! Thank you in advance for the help! I am currently creating a hierarchical cluster tree using the linkage function in MATLAB. I call the function as follows:
links = linkage(samples,'complete', 'mahalanobis');
My variable, samples, is a 25 x 106720 matrix, class double, that contains t values.
Every time I run this in MATLAB, however, I get the following error message:
Error using *
Requested 106720x106720 (84.9GB) array exceeds maximum array size preference. Creation of arrays greater than this limit
may take a long time and cause MATLAB to become unresponsive. See array size limit or preference panel for more
information.
Error in nancov>localcov (line 173)
c = xc' * xc / denom;
Error in nancov (line 116)
c = localcov(x,domle);
Error in pdist (line 181)
additionalArg = nancov(X);
Error in linkage (line 259)
Z = internal.stats.linkagemex(Y,method,pdistArg, memEff);
How do I bypass this error? Is there another way for me to calculate the Mahalanobis distance for hierarchical clustering?
Answers (1)
Rajani Mishra
on 11 Mar 2020
The error occurs because linkage calls pdist, which computes the covariance matrix of your data “samples” using “nancov()”. For data of size 25 x 106720, that covariance matrix is 106720 x 106720, which exceeds the maximum array size preference.
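As a quick check of the figure in the error message, the size of a 106720 x 106720 array of doubles can be worked out directly. This is only a small arithmetic sketch; the variable names are chosen for illustration.

```matlab
% Memory needed for a p-by-p covariance matrix of doubles
p = 106720;              % number of columns (variables) in samples
bytesPerDouble = 8;      % a double occupies 8 bytes
sizeGiB = p^2 * bytesPerDouble / 2^30;
fprintf('%.1f GB\n', sizeGiB);   % prints 84.9 GB
```

The covariance matrix grows with the square of the number of columns, so the 25 rows play no part in the memory requirement.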
You can reduce the size of your data with dimensionality reduction, for example using the function “pca()”. The literature I came across while researching your question discusses this approach, so you may want to consult it as well. To learn more about “pca()”, see: https://www.mathworks.com/help/stats/pca.html
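A minimal sketch of the PCA route might look like the following. It assumes the Statistics and Machine Learning Toolbox is available; the random matrix is a hypothetical stand-in for “samples” (with fewer columns so it runs quickly), and the number of retained components k is an assumption you would need to choose for your own data.

```matlab
% Hypothetical stand-in for the 25 x 106720 matrix from the question
rng(0);
samples = randn(25, 2000);

% pca centers the data and returns one row of scores per observation.
% With 25 observations there are at most 24 components with nonzero variance.
[~, score] = pca(samples);

k = 5;                       % number of components to keep (assumed value)
reduced = score(:, 1:k);     % 25-by-k matrix of component scores

% The covariance matrix pdist needs is now only k-by-k
links = linkage(reduced, 'complete', 'mahalanobis');
```

Note that k should stay well below 24 here. If all 24 components are kept, the Mahalanobis distances between the 25 observations all come out equal, so the resulting tree carries no information.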
Alternatively, you can store the data for hierarchical clustering in tall arrays. Tall arrays are designed for working with out-of-memory data. For more information, see: https://www.mathworks.com/help/stats/examples/statistics-and-machine-learning-with-big-data-using-tall-arrays.html