Filtering and Cleaning Data
Dear Friends,
How can I clean this data?
Does anyone have a suggestion for me?
2 comments
Mohammad Sami
on 26 Mar 2020
If you are running R2019b or later, try the interactive data cleaning task in the Live Editor.
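For anyone who prefers the command line, a rough non-interactive counterpart might look like the sketch below. It uses synthetic data (the signal and spike positions are made up for illustration) and the built-in `filloutliers` function with its default detection rule, which may or may not suit the data in the question.

```matlab
% Sketch only: synthetic data, not the data from the question.
rng(2);                          % reproducible example
x = randn(500, 1);               % hypothetical noisy signal
x([50 250]) = [20 -20];          % inject two artificial spikes

% Default detection: more than 3 scaled MAD from the median.
% Detected outliers are replaced by linear interpolation of their neighbours.
[xFilled, wasOut] = filloutliers(x, 'linear');

fprintf('%d samples were replaced\n', nnz(wasOut));
```

`rmoutliers` takes similar detection options if the flagged samples should be removed rather than replaced.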
Answers (3)
Peng Li
on 26 Mar 2020
Technically, this is not a programming issue; it is a question about the algorithm. It all depends on what you mean by cleaning. Are the spikes what you want to filter out, or do you want to do something else? If the spikes are what should be filtered out, the simplest way to clean the data is the so-called three-sigma criterion: anything beyond mean ± 3 × standard deviation is treated as an outlier. There are other tricks too. So, again, I believe this is about the algorithm, not about programming.
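As a minimal sketch of the three-sigma criterion, the snippet below runs on synthetic data; the signal, spike positions, and variable names are assumptions made for illustration, not taken from the question.

```matlab
% Three-sigma sketch on synthetic data (made up for illustration).
rng(0);                           % reproducible example
x = randn(1000, 1);               % hypothetical "clean" signal
x([100 500 900]) = [25 -30 40];   % inject three artificial spikes

mu    = mean(x, 'omitnan');       % sample mean, ignoring NaNs
sigma = std(x, 0, 'omitnan');     % sample standard deviation, ignoring NaNs

isOut  = abs(x - mu) > 3 * sigma; % flag points beyond mean +/- 3*std
xClean = x(~isOut);               % drop the flagged points
```

Note that large spikes inflate both the mean and the standard deviation, so the criterion can miss smaller outliers; on R2017a or later, `isoutlier(x, 'mean')` applies the same rule, and its default median-based method is more robust to this effect.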
Peng Li
on 26 Mar 2020
A simple workaround:
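b = DADOSUFCS2(:, 2);           % second column of the data
bstd = movstd(b, 100);          % moving standard deviation over a 100-sample window
thre = mean(bstd, 'omitnan');   % threshold: mean of the moving std (nanmean needs a toolbox)
bnew = b(bstd <= thre);         % keep samples whose local std does not exceed the threshold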
3 comments
Peng Li
on 26 Mar 2020
Sorry, it's difficult for me to understand what you are trying to ask. What I provided is a simple algorithm based on the moving standard deviation: in my example, any sample whose corresponding moving standard deviation is above a threshold is treated as an outlier.
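To make the idea concrete, here is a sketch of the moving-standard-deviation approach on synthetic data. The signal, the burst location, and the window length are all assumptions chosen for illustration; unlike simply deleting samples, this variant keeps the original length by blanking the flagged samples and interpolating across them.

```matlab
% Moving-std sketch on synthetic data (signal and window are illustrative).
rng(1);                                     % reproducible example
t = (1:2000)';
b = sin(2*pi*t/2000) + 0.05*randn(2000, 1); % hypothetical slowly varying signal
b(1000:1010) = b(1000:1010) + 10;           % inject a short burst of spikes

win  = 100;                                 % window length in samples (assumption)
bstd = movstd(b, win);                      % local standard deviation
thre = mean(bstd, 'omitnan');               % threshold: average local std

isOut   = bstd > thre;                      % samples in "noisy" neighbourhoods
bMarked = b;
bMarked(isOut) = NaN;                       % keep the length, blank flagged samples
bFilled = fillmissing(bMarked, 'linear');   % optional: interpolate across the gaps
```

One thing to keep in mind with this design: every window that contains a spike has an elevated standard deviation, so the flagged region is wider than the burst itself, roughly by the window length.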
Peng Li
on 26 Mar 2020
How do you know that they are not real? Do you have a specific criterion? If you do, then it is simple. If you don't, you may need to work a bit more on the algorithm side, as no single algorithm is best for filtering a general data set. You are the person who knows your data best.