Effacer les filtres
Effacer les filtres

How can I remove outliers in my data using Cook's Distance?

4 vues (au cours des 30 derniers jours)
Fatemah Ebrahim
Fatemah Ebrahim le 29 Juin 2020
I have a large dataset, 6 .'xlsx' files with ~ 400,000 rows each, and I want to use Cook's Distance to determine the outliers in the fourth column of each dataset and then delete the corresponding row. How would I do that?
  2 commentaires
Fatemah Ebrahim
Fatemah Ebrahim le 29 Juin 2020
Modifié(e) : Fatemah Ebrahim le 29 Juin 2020
Hi! So I'm using the code they used on one of the '.xlsx' files as so:
X = A_t; % where this is a datetime value
Y = Adata(:,4); % where we are pulling the fourth column of the table
mdl = fitlm(X,Y);
plotDiagnostics(mdl,'cookd')
find((mdl.Diagnostics.CooksDistance)>3*mean(mdl.Diagnostics.CooksDistance))
And I am getting this error:
Error using classreg.regr.TermsRegression/handleDataArgs (line 550)
Predictor variables must be numeric vectors, numeric matrices, or
categorical vectors.
Error in LinearModel.fit (line 1184)
[X,y,haveDataset,otherArgs] =
LinearModel.handleDataArgs(X,varargin{:});
Error in fitlm (line 121)
model = LinearModel.fit(X,varargin{:});
Please let me know if you have any idea how to address this error, there does not seem to be much information on this. Thanks!

Connectez-vous pour commenter.

Réponses (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by