About high-dimensional, low-sample data using PCA

Yimin Chen on 8 Nov 2016
Answered: Aditya on 27 Jun 2024
I am using PCA to detect abnormalities in time-series data. My dataset is high-dimensional with few samples (a 15-by-530 data matrix). Can I still use PCA to obtain statistics such as T^2 and SPE? I noticed that some articles state that it is improper to use PCA to obtain these statistics in such a case.

Answers (1)

Aditya on 27 Jun 2024
Using PCA to detect abnormalities in time-series data is challenging when the dataset is high-dimensional with few samples. The main concern is that PCA relies on the sample covariance matrix, which is poorly estimated when the number of features (dimensions) far exceeds the number of samples. With 15 samples and 530 features, the sample covariance matrix has rank at most 14, so at most 14 principal components can be estimated, and the T² and SPE statistics built on them should be interpreted with caution.
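% Simulate high-dimensional, low-sample data (15 samples x 530 variables)
rng(0);
DATASET = rand(15, 530);
% Apply PCA; pca centers the data and returns the column means in mu.
% With fewer samples than variables, at most 14 components are returned.
[coeff, score, latent, ~, ~, mu] = pca(DATASET);
% Retain only the first k components. With all 14 retained, T² is the same
% for every sample and SPE is zero, so neither statistic is informative.
k = 3;  % illustrative choice; select it from explained variance or cross-validation
% Calculate T² statistic in the retained subspace
T2 = sum((score(:, 1:k) ./ sqrt(latent(1:k)')).^2, 2);
% Calculate SPE (Q-statistic) from the residual of the k-component model,
% adding the mean back so the reconstruction is on the original scale
reconstructed = score(:, 1:k) * coeff(:, 1:k)' + mu;
SPE = sum((DATASET - reconstructed).^2, 2);
% Set thresholds for T² and SPE (e.g., 95% confidence level)
alpha = 0.05;
% The chi-square limit is a large-sample approximation; it is rough for 15 samples
T2_threshold = chi2inv(1 - alpha, k);
% Empirical percentile of the training SPE; very coarse with only 15 values
SPE_threshold = prctile(SPE, 95);
% Detect abnormalities
abnormal_T2 = T2 > T2_threshold;
abnormal_SPE = SPE > SPE_threshold;
disp('Abnormalities detected by T²:');
disp(abnormal_T2);
disp('Abnormalities detected by SPE:');
disp(abnormal_SPE);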
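
One pitfall in this setting is worth checking directly: when every component that pca returns is used, the statistics stop discriminating between samples. The following is a minimal sketch, assuming the same kind of synthetic 15-by-530 matrix as above; the variable names (X, T2_all, SPE_all, and so on) are illustrative and not part of the original answer. It shows that pca returns only n - 1 = 14 components, that T² computed over all of them equals (n-1)^2/n for every sample, and that SPE of the full model is zero up to round-off once the mean is added back.

```matlab
% Sketch: why retaining every component is uninformative when n < p
rng(0);
X = rand(15, 530);                 % 15 samples, 530 variables (synthetic)
n = size(X, 1);
[coeff, score, latent, ~, ~, mu] = pca(X);
numComponents = numel(latent);     % n - 1 components, not 530

% T² over all returned components is the same value for every sample
T2_all = sum((score ./ sqrt(latent')).^2, 2);
T2_expected = (n - 1)^2 / n;
T2_spread = max(abs(T2_all - T2_expected));

% SPE of the full model is zero up to round-off once the mean is added back
residual = X - (score * coeff' + mu);
SPE_all = sum(residual.^2, 2);

fprintf('Components returned: %d\n', numComponents);
fprintf('Largest deviation of T2 from (n-1)^2/n: %.2e\n', T2_spread);
fprintf('Largest SPE with all components: %.2e\n', max(SPE_all));
```

For this reason the number of retained components should be smaller than n - 1, so that T² measures variation inside the model subspace and SPE measures what is left outside it.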
