Messy data: finding an outlier as a value greater or less than the previous value and replacing it with NaN

Dear Matlabcentral,
I have a matrix that I imported from Excel. I want to find and replace the values in the matrix that differ by more than 0.5, in either direction, from the previous value in the column.
For example, if I consider a single column as a vector:
1.3 1.4 1.3 1.3 1.9 1.3 1.4 ..
It is clear from looking at the data that the value 1.9 is anomalous, and I would like to replace it with NaN. Is there a quick function that samples the nearest neighbours and identifies values that are 0.5 greater or less than their nearest neighbours?
Many thanks

Accepted Answer

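X = [1.3, 1.4, 1.3, 1.3, 1.9, 1.3, 1.4];
% true where two consecutive elements differ by at least 0.5
index = (abs(diff(X)) >= 0.5);
% invalid: jump to the next element AND jump from the previous one
invalid = and([index, false], [false, index]);
X(invalid) = NaN;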
Or:
invalid = strfind(index, [true, true]) + 1;
The problem remains tricky. What do you expect for:
X = [1,0,1,0,1,0,1,0] ?
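
To make that question concrete, here is a small sketch of what the rule "differs from both neighbours by at least 0.5" would do with the alternating vector. The 0.5 threshold and the both-neighbours rule are taken from the discussion above; whether the result is what is wanted depends on how an outlier is meant to be defined.

```matlab
X = [1, 0, 1, 0, 1, 0, 1, 0];
index = abs(diff(X)) >= 0.5;                     % every step here is a jump of 1
invalid = and([index, false], [false, index]);   % jump on both sides of an element
X(invalid) = NaN;
disp(X)
```

With this rule every interior element is flagged, so only the first and last values survive. That is why the definition of "anomalous" has to be pinned down before choosing a method.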

More Answers (2)

You might look at these outlier identification pages: deleteoutliers, and Median Absolute Deviation for some more "standard" definitions of outliers.
X = [1.3 1.4 1.3 1.3 1.9 1.3 1.4];
X(find(diff(X) > 0.5) + 1) = NaN;

2 Comments

This considers only a positive step, so at least an abs() should be added. In addition, I assume that "than their nearest neighbours" means that both neighbors must differ by this distance, but the OP should state this explicitly.
As well as explaining his totally contradictory explanation of "columns". I believe his example "single column" is actually NOT a column, but a "single ROW." Otherwise the rest of his explanation, both before and after, doesn't make sense at all. If his "single column" truly is a column, then his suggested algorithm is wrong because it's acting on the actual column instead of the previous column like he wanted.
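
Taking both comments together, one possible reading can be sketched as follows: work down each column of the matrix, use abs() so that negative steps count too, and flag a value only if it differs from both the value before and the value after it. The matrix M, the threshold tol, and the both-neighbours rule are assumptions made for illustration, since the original question leaves these points open.

```matlab
% Hypothetical example matrix: each column is one series.
M = [1.3 2.0;
     1.4 2.1;
     1.3 2.0;
     1.3 1.2;
     1.9 2.1;
     1.3 2.0;
     1.4 2.1];
tol = 0.5;                               % assumed threshold

step = abs(diff(M, 1, 1)) >= tol;        % jumps between consecutive rows, per column
pad  = false(1, size(M, 2));             % first and last rows have only one neighbour
invalid = [step; pad] & [pad; step];     % jump to the next row AND from the previous row
M(invalid) = NaN;
disp(M)
```

If the series run along rows instead, diff(M, 1, 2) with the padding concatenated horizontally would play the same role.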
