How to remove multiple outlier data in a rectangular maze?

I have this rodent x,y coordinates as described in the attached picture. There are some outliers in the data, I need to remove them. I have to dynamically define a zone based on the extreme coordinates in the quardrants.
For example, let's say I need to remove the outlier data circled in red. The datapoint is in Maze4. I have attached the data for Maze4. I want to remove the bins where histcounts2 is < 2. I also need the 'xcoordinates2' and 'ycoordinates2' array after cleaning the outliers. I tried this so far.
h4 = histogram2(Maze4.xcoordinates2, Maze4.ycoordinates2, ...
nbins,'DisplayStyle','tile','ShowEmptyBins','on');
counts4 = histcounts2(Maze4.xcoordinates2, Maze4.ycoordinates2, 25);
index4 = h4.Values(counts4<2);
But the index4 gives me 1-D array. How do I solve this?

 Réponse acceptée

First try to avoid the problem by not letting your rodents escape from your maze! 🐭🐹🐁🐀🤣
Then if you know your x coordinates must be between -80 and 80, use masking to extract only those good indexes:
% Find out which indexes are outside the -80 to 80 range.
mask = abs(x <= 80);
% Extract only those good indexes.
x = x(mask);
y = y(mask);

5 commentaires

Thanks for your response. I would but I don't do the experiments... haha.
I don't know the good limiting points of the x or y coordinates. They may change depending upon the camera position on that day. I have seen the x values ranging from 60 to 600 depending on the trials. So, I can not hard code, instead I am planning to remove outliers then define the limiting points based on max and min x.
I hope that makes sense.
Take a histogram of all your x values and plot them. Then look for bins that have more than a certain count in them.
If you have any more questions, then attach your data and code to read it in with the paperclip icon after you read this:
Thanks for the insight. I have edited my question and attached a sample data. Sorry for being vague earlier.
How about this, where you take all x data between the 5% and 95% points?
s = load('maze4.mat')
% Extract the table
t = s.Maze4
xOriginal = t.xcoordinates2;
yOriginal = t.ycoordinates2;
subplot(2, 2, 1)
plot(xOriginal, yOriginal, 'r.', 'MarkerSize', 8)
grid on
title('Original Data')
subplot(2, 2, 2)
[counts, edges] = histcounts(xOriginal, 100)
bar(edges(1:end-1), counts, 1)
grid on;
title('Histogram of X values')
theCDF = rescale(cumsum(counts) / sum(counts), 0, 100);
subplot(2, 2, 4)
plot(edges(1:end-1), theCDF, 'b-', 'LineWidth', 2)
title('CDF of Histogram')
grid on;
% Find the index of the 5% and 95% points
index1 = find(theCDF > 5, 1, 'first')
x1 = edges(index1)
xline(edges(index1), 'LineWidth', 2, 'Color', 'r')
% Find the index of the 5% and 95% points
index2 = find(theCDF > 95, 1, 'first')
x2 = edges(index2)
xline(edges(index2), 'LineWidth', 2, 'Color', 'r')
% Get a mask for the x values we want to exclude
indexesToKeep = t.xcoordinates2 >= x1 & t.xcoordinates2 <= x2;
xCleaned = xOriginal(indexesToKeep);
yCleaned = yOriginal(indexesToKeep);
subplot(2, 2, 3)
plot(xCleaned, yCleaned, 'r.', 'MarkerSize', 8)
grid on
title('Cleaned Data')
Or else you can use a clustering algorithm like dbscan. Demo attached.
Very elegant! Thank you so much!

Connectez-vous pour commenter.

Plus de réponses (0)

Catégories

En savoir plus sur 2-D and 3-D Plots dans Centre d'aide et File Exchange

Produits

Version

R2022a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by