From a pdf to a histogram.

Hi. I have the two parameters of a lognormal distribution, so I can plot the pdf. To convert the density into a histogram, I should calculate the integral under the curve over bins of a certain width, right? I know mu and sigma of the pdf and nothing else. How can I extract the values of the density associated with bins of width 100? My x goes from 0 to approximately 250000. I cannot manually calculate 2500 integrals to build the histogram.
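The integral of the pdf over a bin equals the difference of the cdf at the two bin edges, so no manual integration should be needed. A minimal sketch, assuming the Statistics and Machine Learning Toolbox function logncdf is available; the values of mu and sigma are hypothetical placeholders, not data from the question:

```matlab
mu       = 10;              % hypothetical log-scale mean
sigma    = 0.8;             % hypothetical log-scale standard deviation
binWidth = 100;
edges    = 0:binWidth:250000;   % 2501 edges, i.e. 2500 bins

% Probability mass of each bin = cdf(right edge) - cdf(left edge)
p = diff(logncdf(edges, mu, sigma));

% Average density over each bin (mass divided by bin width)
d = p / binWidth;

% Mass beyond the last edge, not covered by any bin
tailMass = 1 - logncdf(edges(end), mu, sigma);
```

Here p(k) is the share of the distribution falling between edges(k) and edges(k+1); multiplying p by a population size would give expected head counts per bin.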

4 comments

José-Luis on 8 Aug 2017
The histogram is an approximation of the pdf. Why settle for that?
Alfonso Russo on 8 Aug 2017
Edited: Alfonso Russo on 8 Aug 2017
Because by "discretising" the pdf I can sum bins coming from different lognormals. My overall goal is to combine many lognormals that describe the distribution of income in several countries. For each country I know only the two distributional parameters mu and sigma, so I can plot the pdf. I do not have any other data. One strategy would be to estimate a mixture of the distributions as the "population-weighted sum of the subgroup densities". The problem is that I have no idea how to sum PDFs. My supervisor suggested "discretising" the distributions so that I can sum all the densities associated with, say, the bin 0-99.9$ for all the countries in Europe and obtain the total European density of people with annual income between 0 and 99.9$. Repeating this procedure for all the bins, I will obtain a distribution for Europe as a whole. Does that make sense?
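The bin-by-bin summation described here could be sketched as follows. All parameters, populations and the number of countries are hypothetical, and logncdf from the Statistics and Machine Learning Toolbox is assumed to be available:

```matlab
mu    = [9.8 10.2 10.5];     % hypothetical log-scale means, one per country
sigma = [0.9 0.7 0.8];       % hypothetical log-scale standard deviations
pop   = [60e6 80e6 10e6];    % hypothetical populations
w     = pop / sum(pop);      % population weights, summing to 1

edges = 0:100:250000;        % common bin edges for every country
P = zeros(numel(mu), numel(edges) - 1);
for i = 1:numel(mu)
    % Row i: probability mass of each bin for country i
    P(i, :) = diff(logncdf(edges, mu(i), sigma(i)));
end

pEurope      = w   * P;      % share of the pooled population in each bin
peopleEurope = pop * P;      % expected head count in each bin
```

Using the same edges for every country is what makes the bins addable; the weights w turn the per-country shares into a share of the pooled population.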
Pawel Tokarczuk on 9 Aug 2017
"The sum of two pdf's is their convolution."
No; if two independent variables, x and y, have pdf's p(x) and q(y), then the convolution of p and q is the pdf of the sum (x + y). The (weighted and normalized) sum of two pdf's is a mixture.
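The difference between the two operations can be checked numerically. A rough sketch with hypothetical parameters and weights, assuming lognpdf from the Statistics and Machine Learning Toolbox; the convolution is approximated by a Riemann sum on a uniform grid:

```matlab
m1 = 0; s1 = 0.5;            % hypothetical parameters of the first lognormal
m2 = 1; s2 = 0.3;            % hypothetical parameters of the second lognormal
w  = [0.6 0.4];              % hypothetical mixture weights, summing to 1

dx = 0.01;
x  = 0:dx:30;                % grid starting at 0, wide enough to hold almost all mass
p  = lognpdf(x, m1, s1);
q  = lognpdf(x, m2, s2);

% Mixture: weighted sum of the two pdfs
mixPdf = w(1)*p + w(2)*q;

% Convolution: approximate pdf of the sum of two independent draws
sumPdf = conv(p, q) * dx;
xs     = (0:numel(sumPdf)-1) * dx;   % grid for the sum, from 0 to 60

meanMix = trapz(x,  x  .* mixPdf);   % close to w(1)*E[X] + w(2)*E[Y]
meanSum = trapz(xs, xs .* sumPdf);   % close to E[X] + E[Y]
```

The two means differ, which is one quick way to see that the mixture and the convolution describe different random variables.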
José-Luis on 9 Aug 2017
Edited: José-Luis on 9 Aug 2017
"The sum of two pdf's is their convolution".
That erroneous comment was mine.
Pawel is right. I don't know how one goes about mixing two pdf's, but there is some literature on it.
That reference is pretty old, so there might be newer (hopefully better) methods. Some fitting is required.


Answers (3)

David Gonzalez on 7 Aug 2017

0 votes

Hi,
You can use the function lognrnd(mu,sigma) to draw samples from the lognormal distribution without computing the integral. Then, you can use hist(x,nbins) to create the histogram. Here is an example:
hist(lognrnd(0,1,[3000,1]),100) % lognormal histogram with mu=0, sigma=1, 3000 samples and 100 bins
David

2 comments

Alfonso Russo on 7 Aug 2017
That will only plot the histogram. I need the values associated with each bin, since I have several distributions that I need to add together. Is there a way to extract the values associated with each bin, in order to add up the 0-100 bins coming from different lognormals?
Alfonso Russo on 8 Aug 2017
Moreover, if my lognormal described a random variable X across, say, 322000000 observations and I wanted to specify only the bin width and not the number of bins (so that MATLAB shows how many bins are created), how should I modify the code?
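One possible way to get the bin values instead of a plot, and to specify the width rather than the number of bins, is histcounts with the 'BinWidth' option. A minimal sketch using the same mu, sigma and sample size as the example above; the bin width of 0.5 and the common edge vector are arbitrary choices for illustration:

```matlab
rng(0);                               % fixed seed so the draw is reproducible
X = lognrnd(0, 1, [3000, 1]);         % sample from the lognormal, mu = 0, sigma = 1

% Specify only the bin width; histcounts chooses the edges and the number of bins
[counts, edges] = histcounts(X, 'BinWidth', 0.5);
nbins = numel(counts);                % number of bins that were created

% To add bins from several samples, pass one common edge vector instead
commonEdges  = 0:0.5:50;
countsCommon = histcounts(X, commonEdges);
```

With 322000000 observations, drawing the sample may be unnecessary: the expected count in a bin is the number of observations times the probability mass of that bin.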


Sean de Wolski on 7 Aug 2017

0 votes

Look at the 'Normalization' property of histcounts.

2 comments

Alfonso Russo on 8 Aug 2017
It does not work. I need exactly the densities associated with approx. 2500 bins of width 100.
Alfonso Russo on 8 Aug 2017
Edited: Alfonso Russo on 8 Aug 2017
There is something that might work, since histcounts allows me to set 'Normalization' to 'countdensity'. How can I define my variable X as coming from a lognormal with given mu and sigma and divide it into bins of width 100, so that I can use histcounts(X, 'Normalization', 'countdensity')?
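A sketch of what this could look like, assuming X is simulated with lognrnd and the bins are given as an explicit edge vector. The parameters and the sample size are hypothetical:

```matlab
mu = 10; sigma = 0.8;                 % hypothetical parameters
rng(1);                               % fixed seed so the draw is reproducible
X = lognrnd(mu, sigma, [1e5, 1]);     % hypothetical sample size
edges = 0:100:250000;                 % 2500 bins of width 100

counts  = histcounts(X, edges);                                   % raw counts per bin
cntDens = histcounts(X, edges, 'Normalization', 'countdensity');  % counts divided by bin width
pdfEst  = histcounts(X, edges, 'Normalization', 'pdf');           % sample estimate of the pdf
```

Samples above the last edge fall outside every bin. Because these values come from a random sample, they only approximate the density; the approximation improves with sample size.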


Torsten on 9 Aug 2017

0 votes

I don't understand the problem you have.
The probability P that a person in Europe has income x is
P(income=x)=sum_{i=1}^{N} P(income=x|person comes from country i)*P(person comes from country i)
where N is the number of countries you want to take into consideration.
Thus the aggregated probability density function is
d_aggregated(x)=sum_{i=1}^{N} w_i * d_i(x)
where w_i is the number of people in country i divided by the total number of people in the countries under consideration and d_i(x) is the pdf for the incomes in country i.
Best wishes
Torsten.
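The aggregated density above could be evaluated for N countries roughly as follows. The parameters, populations and grid are hypothetical, and lognpdf from the Statistics and Machine Learning Toolbox is assumed:

```matlab
mu    = [9.8 10.2 10.5];     % hypothetical, one entry per country
sigma = [0.9 0.7 0.8];       % hypothetical
pop   = [60e6 80e6 10e6];    % hypothetical populations
w     = pop / sum(pop);      % w_i = share of country i in the total population

x = linspace(0, 250000, 5001);   % income grid

% d_aggregated(x) = sum over i of w_i * d_i(x)
dAgg = zeros(size(x));
for i = 1:numel(w)
    dAgg = dAgg + w(i) * lognpdf(x, mu(i), sigma(i));
end

% Sanity check: area under the curve, slightly below 1 because the grid stops at 250000
area = trapz(x, dAgg);
```

Since the weights sum to 1 and each d_i integrates to 1, the aggregated curve is itself a pdf, which the area check should reflect.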

2 comments

Imagine that I have two lognormals L1 and L2 with parameters (m1,s1) and (m2,s2) and weights w1=0.6 and w2=0.4. To obtain the aggregated PDF, I should type in MATLAB:
d_aggregated(x)= 0.6 * 1/(s1*x*sqrt(2*pi))*exp(- ((log(x)- m1).^2))/(2*(s1.^2)) + 0.4 * 1/(s2*x*sqrt(2*pi))*exp( - ((log(x) - m2).^2))/(2*(s2.^2))
% obtaining the aggregated PDF.
Is it correct? Once calculated, is there a way to plot it? Like plot(d_aggregated(x))?
Torsten on 9 Aug 2017
m1 = ...;
m2 = ...;
s1 = ...;
s2 = ...;
x = 0:0.02:10;
y = 0.6*lognpdf(x,m1,s1)+0.4*lognpdf(x,m2,s2);
plot(x,y)
Best wishes
Torsten.
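For comparison with the hand-typed expression in the question: as written there, the division by 2*(s1.^2) falls outside exp(...), and 1/(s1*x*sqrt(2*pi)) would need element-wise division when x is a vector. A sketch of the explicit formula checked against lognpdf, with hypothetical values standing in for the parameters:

```matlab
m1 = 0; s1 = 0.5;    % hypothetical values, for illustration only
m2 = 1; s2 = 0.3;    % hypothetical values, for illustration only
x  = 0.02:0.02:10;   % start above 0 so that log(x) and 1./x stay finite

% Explicit lognormal pdf: 1/(2*s^2) sits inside exp(), and ./ .* act element-wise
lpdf = @(x, m, s) exp(-(log(x) - m).^2 ./ (2*s.^2)) ./ (s .* x .* sqrt(2*pi));

yManual  = 0.6*lpdf(x, m1, s1)    + 0.4*lpdf(x, m2, s2);
yBuiltin = 0.6*lognpdf(x, m1, s1) + 0.4*lognpdf(x, m2, s2);

maxDiff = max(abs(yManual - yBuiltin));   % expected to be at rounding-error level
plot(x, yManual)
```

Using lognpdf directly, as in the reply above, avoids having to get the parentheses right by hand.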
