paretotails function and frequency vector

Rémy Bretin on 5 Nov 2020
Hi MATLAB community,
I have a series of measurements that follow a normal distribution and that I know must be greater than Xmin and lower than Xmax.
My problem is that I have 10^9 measurements, and I can't store all of them in one array because of memory limits.
My solution is to discretize the range of values, knowing that my measurements are accurate to nd decimal places:
step=10^-nd;
X=floor(Xmin*10^nd)/10^nd : step : ceil(Xmax*10^nd)/10^nd;
Then, for each measurement, I find the index of the closest value in X and add 1 to the frequency vector CNT:
CNT=zeros(size(X));
Then, for each measurement m:
[~,i]=min(abs(X-m)); CNT(i)=CNT(i)+1;
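The counting step above can be sketched as follows. This is only an illustration under stated assumptions: the measurements are assumed to arrive in chunks, the chunks are simulated here with randn, and the values of Xmin, Xmax, nd, nChunks and chunkSize are made up. Because X is a uniform grid, the index of the nearest grid point can be computed directly instead of searching with min, and accumarray adds a whole chunk to CNT at once.

```matlab
% Illustrative values only; none of these come from the original post
Xmin = -5; Xmax = 5; nd = 2;

% Build the grid from integer multiples of the step to limit rounding drift
k0 = floor(Xmin*10^nd);
k1 = ceil(Xmax*10^nd);
X = (k0:k1) / 10^nd;
CNT = zeros(size(X));

rng(0);                                % reproducible stand-in data
nChunks = 10; chunkSize = 1e5;
for c = 1:nChunks
    m = randn(chunkSize, 1);           % stand-in for one chunk of measurements
    m = m(m >= Xmin & m <= Xmax);      % keep only in-range values
    idx = round(m*10^nd) - k0 + 1;     % index of the nearest grid point
    % Count how many values fell on each grid point and add to the totals
    CNT = CNT + accumarray(idx, 1, [numel(X) 1]).';
end
```

Only one chunk and the frequency vector are held in memory at any time, so the memory use does not grow with the total number of measurements.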
The resulting distribution can be displayed with:
Edges=[X-step/2 X(end)+step/2];
figure; histogram('BinEdges',Edges,'BinCounts',CNT);
My goal is to estimate the probability of measuring a value greater than a threshold value Xth after 10^14 or more measurements.
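One possible reading of this goal, stated here as an assumption rather than something the post confirms: if p is the probability that a single measurement exceeds Xth, and the measurements are independent, the probability of seeing at least one such value in N measurements is 1-(1-p)^N. A short sketch with a hypothetical value of p:

```matlab
p = 1e-15;   % hypothetical single-measurement exceedance probability
N = 1e14;    % number of measurements

% 1-(1-p)^N, evaluated with log1p/expm1 because 1-p loses precision
% in double arithmetic when p is this small
pAtLeastOne = -expm1(N * log1p(-p));
```

When N*p is small, the result is close to N*p, so the quality of the estimate depends almost entirely on how well the tail model predicts p.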
To do that, I would like to apply the paretotails function to my problem.
Unfortunately, the function does not offer a way to use a frequency vector.
So I am asking for your help, in case anyone has a solution to my issue.
Thank you all in advance,
Rémy

Accepted Answer

Aditya Patil on 20 Nov 2020
I have brought this issue to the attention of the relevant team.
If the data is time-based, you can work around the problem by downsampling it. If not, you can recreate the data from the histogram while reducing the number of samples in each bin by the same factor. You should then be able to use paretotails.
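The second workaround could look roughly like the sketch below. The binned data is a synthetic stand-in for the X and CNT from the question, and factor, pl, pu and Xth are hypothetical values chosen only for illustration. It requires the Statistics and Machine Learning Toolbox for paretotails.

```matlab
% Synthetic binned data standing in for the X and CNT from the question
nd = 2; step = 10^-nd;
X = (-500:500) / 10^nd;
N = 1e9;
CNT = round(N * exp(-X.^2/2) / sqrt(2*pi) * step);  % idealised normal counts

factor = 1e4;                       % hypothetical reduction factor
reduced = round(CNT / factor);      % scale every bin by the same factor
data = repelem(X, reduced).';       % one sample per remaining count

pl = 0.01; pu = 0.99;               % assumed lower/upper tail cut-offs
obj = paretotails(data, pl, pu);    % empirical centre with Pareto tails

Xth = 4;                            % hypothetical threshold
pExceed = 1 - cdf(obj, Xth);        % estimated P(measurement > Xth)
```

Note that bins holding fewer than factor/2 counts round to zero, so the reduction discards exactly the rarest observations. The smaller the factor that memory allows, the more of the tail survives for the fit.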

More Answers (0)
