How to remove extra values from a histogram in MATLAB
Hello everyone, I hope you are doing well.
I have the following dataset, which contains a pattern. Some values are outliers (or, you could say, missing values) that occur in different places. I want to remove these values using a histogram.
I have computed the histogram of the data, as you can see in the image untitled.jpg. Three values (4800, 5130 and 5540) have histogram counts of 322, 317 and 312, while the others have counts of less than 50.
I want to keep the values and indices of the data in bins whose count is at least 50% of the maximum histogram count (161 in the above case) and remove the remaining values.
I have written the following code, but it returns just a single value, not the original matrix (4800 5130 5540).
Can anybody help me with this, please?
h=histogram(Values)
sumofbins=max(h.Values);
size_MP=round(50/100*sumofbins);
ValueofHistogram= h.Values;
Bindata=h.Data
for i=1: length(ValueofHistogram)
    if ValueofHistogram(i)<size_MP;
        Bindata(i)=0;
    end
end
1 comment
Med Future
on 17 Apr 2022
Accepted Answer
More Answers (1)
You can't change the 'Values' property of a histogram directly, but you can change its underlying 'Data'. In this case, you can remove data from within those bins whose Value is less than half the maximum Value:
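load('His.mat')   % assumed to provide the variable Values
h=histogram(Values)
sumofbins=max(h.Values);             % height of the tallest bin
size_MP=round(50/100*sumofbins);     % threshold: 50% of that height
ValueofHistogram= h.Values;          % bin counts
Bindata=h.Data;                      % copy of the underlying data
Binedges=h.BinEdges;
Binedges(end) = Inf;                 % so data equal to the last right edge is captured too
for i=1: length(ValueofHistogram)
    if ValueofHistogram(i)<size_MP
        % delete every data point that falls into this small bin
        Bindata(Bindata >= Binedges(i) & Bindata < Binedges(i+1)) = [];
    end
end
xl = xlim();                         % remember the current x-axis limits
h.Data = Bindata;                    % update the histogram with the reduced data
xlim(xl); % restore axes xlim, if you want to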
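The same idea can also be written without a loop and without drawing a figure. The following is only a minimal sketch, not part of the original answer: it uses histcounts and discretize on synthetic data. The contents of Values and the explicit bin edges are assumptions chosen for illustration; calling histcounts(Values) without edges would use automatic binning instead.

```matlab
% Synthetic stand-in for the poster's data (hypothetical values)
Values = [repmat(4800,1,322), repmat(5130,1,317), repmat(5540,1,312), ...
          4100, 4950, 5300, 6000];

% Assumed bin edges, 100 units wide
edges  = 4000:100:6100;
counts = histcounts(Values, edges);     % number of points per bin

% Bin number of every data point (same edge convention as histcounts)
binIdx = discretize(Values, edges);

% Threshold: 50% of the tallest bin
threshold = round(0.5 * max(counts));

% Keep a point only if its bin reaches the threshold
keepMask       = counts(binIdx) >= threshold;
keptValues     = Values(keepMask);      % data that survives
keptIndices    = find(keepMask);        % positions in the original Values
removedIndices = find(~keepMask);       % positions of the removed points
```

Because keepMask has the same size as Values, both the kept values and their original indices come from one mask, which is what the question asks for.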
12 comments
Med Future
on 17 Apr 2022
Yes:
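load('His.mat')   % assumed to provide the variable Values
h=histogram(Values);
sumofbins=max(h.Values);             % height of the tallest bin
size_MP=round(50/100*sumofbins);     % threshold: 50% of that height
ValueofHistogram= h.Values;          % bin counts
Bindata=h.Data;                      % copy of the underlying data
Binedges=h.BinEdges;
Binedges(end) = Inf;                 % so data equal to the last right edge is captured too
deleted_data_idx = false(size(Bindata));   % marks the data points to delete
for i=1: length(ValueofHistogram)
    if ValueofHistogram(i)<size_MP
        % mark every data point that falls into this small bin
        deleted_data_idx(Bindata >= Binedges(i) & Bindata < Binedges(i+1)) = true;
    end
end
Bindata(deleted_data_idx) = [];      % delete all marked points at once
xl = xlim();
h.Data = Bindata;
xlim(xl); % restore axes xlim, if you want to
disp(find(deleted_data_idx));        % indices, in the original data, of the deleted points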
Med Future
on 17 Apr 2022
Med Future
on 18 Apr 2022
Voss
on 18 Apr 2022
@Med Future Sorry I misunderstood the request. I have modified my comment above to show the indices of the deleted data.
Stephen john
on 18 Apr 2022
@_ I have attached the dataset.
As you can see, the value 4340 is not removed. Can you modify the above code?
A histogram of all data in newone:
S = load('newone.mat');
Values = S.ans;
h=histogram(Values);
I want to check if 4340 is in there:
find(Values == 4340)
It is.
Now, applying the above code to newone:
h=histogram(Values);
sumofbins=max(h.Values);
size_MP=round(50/100*sumofbins);
ValueofHistogram= h.Values;
Bindata=h.Data;
Binedges=h.BinEdges;
Binedges(end) = Inf;
deleted_data_idx = false(size(Bindata));
for i=1: length(ValueofHistogram)
    if ValueofHistogram(i)<size_MP
        deleted_data_idx(Bindata >= Binedges(i) & Bindata < Binedges(i+1)) = true;
    end
end
Bindata(deleted_data_idx) = [];
xl = xlim();
h.Data = Bindata;
xlim(xl); % restore axes xlim, if you want to
disp(find(deleted_data_idx));
4340 is still in there:
find(Bindata == 4340)
@Stephen john: Here is a question for you: Why would I modify the code to remove 4340, when the specification was to remove data that falls into bins whose height is less than 50% of the maximum bin height? As you can see, 4340 is in the tallest bin. Are you changing the specification of the question?
Stephen john
on 18 Apr 2022
@_ Your code works fine on the newone data, but the bin spans 4000 to 5000, so the values at 4340, which should be removed, still remain.
Voss
on 18 Apr 2022
@Stephen john Why should the values at 4340 in newone.mat be removed? Please explain how to determine whether a data point is removed or not removed, in general.
I thought the process was to remove points that fall into bins smaller than 50% of the largest bin. The bin from 4000 to 5000 is the largest bin, so its data is not removed. If data points of value 4340 should in fact be removed, then obviously I am not understanding what the process should be. Please advise.
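To make the bin-width point concrete, here is a small sketch on made-up data (the values, counts and bin edges below are all assumptions, not taken from newone.mat). With bins 1000 wide, a rare value such as 4340 shares the 4000 to 5000 bin with a dominant value and inherits its large count; with bins 100 wide it ends up in a small bin of its own.

```matlab
% Hypothetical data: a dominant value plus a rare one 460 units below it
Values = [repmat(4800, 1, 300), repmat(4340, 1, 5)];

% Coarse bins, 1000 wide: both values land in the 4000-5000 bin
coarseEdges  = 4000:1000:6000;
coarseCounts = histcounts(Values, coarseEdges);
coarseBin    = discretize(4340, coarseEdges);   % bin that holds 4340

% Fine bins, 100 wide: 4340 no longer shares a bin with 4800
fineEdges  = 4000:100:6000;
fineCounts = histcounts(Values, fineEdges);
fineBin    = discretize(4340, fineEdges);

fprintf('Coarse bins: 4340 is in bin %d with count %d\n', coarseBin, coarseCounts(coarseBin));
fprintf('Fine bins:   4340 is in bin %d with count %d\n', fineBin, fineCounts(fineBin));
```

When plotting, a similar effect can be obtained by passing a bin width to histogram, for example histogram(Values, 'BinWidth', 100), before applying the 50% rule. Which width is appropriate depends on how close the unwanted values sit to the wanted ones.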
Med Future
on 19 Apr 2022
Image Analyst
on 19 Apr 2022
Try my well commented last comment below, at the end of my answer. I think it will do what you want now. It gives you the indexes in your original data where the counts are less than 50% of the max count. It then uses those indexes to delete those infrequently occurring data from the original data set.
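The referenced code is not reproduced in this thread, so the following is only one possible sketch of the idea described, under the assumption that "counts" means the number of occurrences of each exact value rather than of each histogram bin. The data are synthetic, and exact matching with unique is only appropriate for data such as integers; noisy floating-point data would need rounding or a tolerance-based grouping such as uniquetol.

```matlab
% Hypothetical data: three frequent values with a rare value mixed in
Values = [repmat(4800,1,322), 4340, repmat(5130,1,317), 4340, repmat(5540,1,312)];

v = Values(:);                                   % work on a column vector
[uniqueVals, ~, groupIdx] = unique(v);           % groupIdx: which distinct value each point is
valueCounts = accumarray(groupIdx, 1);           % occurrences of each distinct value
pointCounts = valueCounts(groupIdx);             % for each point, how often its value occurs

threshold = 0.5 * max(valueCounts);              % 50% of the most frequent value's count
rareMask  = pointCounts < threshold;             % true for infrequently occurring data

removedIndices = find(rareMask);                 % indices into the original data
cleaned        = v(~rareMask);                   % data with the rare values deleted
```

Counting exact values sidesteps the choice of bin width, at the cost of treating two nearby but unequal values as unrelated.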




