How to deal with missing values in a time series?
I have a few time series that are to be used in a regression. For some of them, the first or last few values are missing. How should I deal with this so that MATLAB does not give an error?
Answers (2)
Fangjun Jiang
on 17 Nov 2011
It depends on your needs. You could fill the gaps with zeros or NaN, or you could fill them with values interpolated from the known data.
Richard Willey
on 17 Nov 2011
Hi Yoshiko
The treatment of missing data is a fairly complicated topic. The choice of techniques to handle a missing data problem very much depends on how you plan to use the resulting data.
If you plan to generate a regression model from your data, then your best course of action is to code the missing data points as NaNs. The regress command in Statistics Toolbox will then ignore any row that contains a missing data point.
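As a minimal sketch of that approach, assuming made-up data and hypothetical variable names (x1, x2, y), the series can be kept at full length with NaN in the missing positions and passed straight to regress. The comparison fit on the complete rows is only there to show which rows end up being used.

```matlab
% Hypothetical data: y is the response, x1 and x2 are predictors.
rng(0);                              % reproducible random numbers
n  = 20;
t  = (1:n)';
x1 = t + randn(n,1);
x2 = sin(t) + 0.1*randn(n,1);
y  = 2 + 0.5*x1 - 3*x2 + 0.2*randn(n,1);

x1(1:2) = NaN;                       % first two values of x1 are missing
x2(end) = NaN;                       % last value of x2 is missing

X = [ones(n,1) x1 x2];               % column of ones gives the intercept
b = regress(y, X);                   % rows containing NaN are left out of the fit

% Same fit done by hand on the complete rows, for comparison
ok     = ~any(isnan([y X]), 2);      % true for rows with no NaN
bCheck = X(ok,:) \ y(ok);
```

Because every series keeps the same length, the dimension mismatch that usually triggers the error does not occur, and only the rows flagged in ok contribute to the coefficients.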
I would strongly advise against using interpolation to substitute new values for the missing data points:
- This will affect any future analysis you do with this data and can bias metrics like R^2.
- Your missing values are at the start or end of the series, so filling them is really extrapolation, and using an interpolation technique for extrapolation can produce very inaccurate results.
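To illustrate the extrapolation point, here is a small sketch with a made-up series (y = t.^2, chosen only so the true values are known). With the default linear method, interp1 returns NaN outside the range of the known points; asking it to extrapolate simply extends the end segments, which drifts away from the true values.

```matlab
% Hypothetical series: true values are t.^2, known only for t = 3..8
tAll   = (1:10)';
yTrue  = tAll.^2;
tKnown = (3:8)';
yKnown = tKnown.^2;

% Default linear interpolation: points outside 3..8 come back as NaN
yIn = interp1(tKnown, yKnown, tAll);

% Forcing extrapolation extends the first and last line segments
yEx = interp1(tKnown, yKnown, tAll, 'linear', 'extrap');

% Error at the extrapolated ends (t = 1, 2, 9, 10)
ends   = [1 2 9 10];
errEnd = yEx(ends) - yTrue(ends);    % -6, -2, -2, -6 for this series
```

The error grows with distance from the known data, and a different method (for example 'spline') would give different, not necessarily better, values.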
In a similar vein, coding these missing data points with any kind of numeric value (say 0 or -9999) can cause significant problems, because the regression algorithm will treat the value as a valid number.
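A rough sketch of the effect, using made-up data with an exact linear relation (slope 3, intercept 1) and -9999 as a hypothetical placeholder. The fits use the backslash operator so the example does not depend on any toolbox.

```matlab
% Hypothetical data with an exact linear relation y = 3*x + 1
t = (1:10)';
x = 2*t;
y = 3*x + 1;

xBad      = x;
xBad(1:2) = -9999;                   % placeholder stored where values are missing

% Fit with the placeholder treated as real data
bBad = [ones(10,1) xBad] \ y;        % slope is pulled far away from 3

% Recode the placeholder as NaN and fit on the complete rows only
xNaN = xBad;
xNaN(xNaN == -9999) = NaN;
ok    = ~isnan(xNaN);
bGood = [ones(nnz(ok),1) xNaN(ok)] \ y(ok);   % recovers intercept 1, slope 3
```

If a data file already uses such a placeholder, recoding it to NaN as in the last step is usually the first thing to do before fitting.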