Kernel Density estimation with chosen bandwidth, then normalize the density function (cdf) so that integral of cdf from min to max equal to 1 ; then take the first and second derivative of the cdf
Afficher commentaires plus anciens
I've tried using kde(data,n,MIN,MAX) and [f,xi] = ksdensity(x) over my data points.
I haven't figure out how to retrieve the cdf (density function).
I've tried using linear fit on the density data points (I got from using [density,cdf]=kde(y,1000,min(y),max(y))
but wonder if there is another method to approach finding the kernel density cdf assuming normal distribution with chosen bandwidth (standard deviation) 0.5
Thanks!
Réponses (1)
Tom Lane
le 14 Déc 2017
You seem to want to do a number of things including integrating and specifying a bandwidth. Maybe this will get you started.
Here's an example looking at a kernel density estimate from a gamma random variable and comparing it with the distribution used to generate the data.
>> x = gamrnd(2,3,1000,1);
>> X = linspace(0,40,1000);
>> f = ksdensity(x,X);
>> plot(X,gampdf(X,2,3),'r:', X,f,'b-')
Usually "cdf" is used to describe the cumulative distribution function rather than the density (pdf). Here's how to get that.
>> F = ksdensity(x,X,'Function','cdf');
>> plot(X,gamcdf(X,2,3),'r:', X,F,'b-')
5 commentaires
Tam Ho
le 20 Déc 2017
Tom Lane
le 22 Déc 2017
The density from ksdensity or gampdf is defined so that it integrates to 1 over the real line. You could, I suppose:
- Integrate it from fmin to fmax; say the result is R<1
- Set it to zero outside this range
- Divide by R to get a total integral of 1
To set the bandwidth to 0.5, type "help ksdensity" and look at the argument that defines that. You don't do anything with linspace for that.
Brendan Hamm
le 29 Déc 2017
To follow up on Tom's post:
The ksdensity function includes a Support input argument. You could not use the exact min and max for the Support, but if you extend that range out slightly it will work.
x = gamrnd(2,3,1000,1);
X = linspace(0,40,1000);
n = 1e5;
delta = 0.01; % Factor for expanding the Support
Support = [min(x)-delta,max(x)+delta];
X = linspace(Support(1),Support(2),n);
F = ksdensity(x,X,'Function','cdf','Support',Support);
You can perform a numerical integration with the trapz function:
f = ksdensity(x,X,'Function','pdf','Support',Support);
I = trapz(X,f) % As n-> Inf, I -> 1
Bandwidth is also an option, but when you provide a bounded support (as done above) a log transformation is applied to the data and the bandwidth applies on this scale. So you may need to check if the requirements are on the original scale (which I assume they are).
F = ksdensity(x,X,'Function','cdf','Support',Support,'Bandwidth',0.5);
The points for X from linspace are simply the points to evaluate the pdf/cdf and do not change the fitting which is done only on the underlying data x.
Catégories
En savoir plus sur Kernel Distribution dans Centre d'aide et File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!
