Draw samples using a Non-Gaussian distribution

Question

user20912 le 18 Sep 2023

0
Lien

Utiliser le lien direct vers cette question

https://fr.mathworks.com/matlabcentral/answers/2022687-draw-samples-using-a-non-gaussian-distribution

Modifié(e) : Bruno Luong le 19 Sep 2023

Réponse acceptée : Image Analyst

Ouvrir dans MATLAB Online

Hi!

Let's say I have a vector X of 100 values. How can I draw a sample from this vector using a Non-Gaussian distribution?

Consider the following example in which I'm trying to draw 50 values from the original vector:

x = randn(100,1);
x(random('poisson',1:50))
Array indices must be positive integers or logical values.

Here, the Poisson distribution is just an example. This does not work because of the values I got from the random distribution are not always positive integers or logical values.

Any advice?

Thanks!

5 commentaires
Afficher 3 commentaires plus anciensMasquer 3 commentaires plus anciens

user20912 le 19 Sep 2023

Your interpretation is correct. I express myself in a poor way, my apologies.

Paul le 19 Sep 2023

So you have a discrete random variable N. Its probability distribution function is P(N = n) = p(n) for n = 1:100, and 0 otherwise. As must be the case, sum(p) = 1.

Do you know the distribution function, i.e., the values of p(n), n = 1:100?

Once you define the distribution function, you want to generate 50 samples of N. Are those samples to be generated with or without replacement? The former is straightforward based on the distribution function. The latter would involve reconditioning p(n) after each selection, but also doesn't sound too bad.

Connectez-vous pour commenter.

Connectez-vous pour répondre à cette question.

Answer 1

Image Analyst le 18 Sep 2023

0
Lien

Utiliser le lien direct vers cette réponse

https://fr.mathworks.com/matlabcentral/answers/2022687-draw-samples-using-a-non-gaussian-distribution#answer_1313077

Let's say you have a vector of 100 numbers. The numbers could be anything -- doesn't matter, and we don't care. To make x from rand (uniform) or randn (normal distribution) just serves to confuse the issue with another, unrelated distribution. Let's just say that you have x and it is whatever it is and how you got it doesn't matter at all.

Now let's say you have a Normal distribution with a mean of 60 and a standard deviation of 10. So does that mean you would draw indexes in the range 60 +/- 10, in other words you'd pick indexes from about 50 to 70 from the vector? If not, please clarify.

OK, so now let's say you do not want a Gaussian distribution. Does that mean you don't want indexes from 50-70, but want indexes ranging from 1 to 100 (which would be a uniform distribution), or indexes drawn from some other distribution (like Poisson, log normal, Rayleigh, or some other known distribution)?

If you want to get the indexes of x from some arbitrary, but known, distribution, you can use random to get samples drawn from dozens of commonly used distribution functions. If you get floating point numbers, then scale them from 1 to 100 using rescale and then round them so they can be used as indexes. If you have some custom distribution, then you can use Inverse Transform Sampling ( https://en.wikipedia.org/wiki/Inverse_transform_sampling ) to get some numbers drawn from your custom distribution. See the attached demo for how I used inverse transform sampling to get numbers drawn from a Rayleigh distribution.

Finally, is this your homework, or just something you're curious about?

1 commentaire
Afficher -1 commentaires plus anciensMasquer -1 commentaires plus anciens

user20912 le 19 Sep 2023

Thanks for your answer. I'm trying to understand the central limit theorem. So esentially, what I have in mind was:

1) Generate a vector, uniform, normal or whatever. As you said, I just happen to have this vector since is not important.

2) Draw a sub-sample of this vector based on a probability density function which were not Gaussian, whatever this could be. I just select Poisson, from the random function, because only requiere one parameter and it was different than Gaussian. Maybe I used in a wrong way, honestly, I don't know.

3) Compute some parameters of this sub-sample such as mean and variance.

So, you answered exactly what I was thinking about:

If you want to get the indexes of x from some arbitrary, but known, distribution, you can use random to get samples drawn from dozens of commonly used distribution functions. If you get floating point numbers, then scale them from 1 to 100 using rescale and then round them so they can be used as indexes

Again, thanks for the help and the clarification.

Connectez-vous pour commenter.

Answer 2

Matt J le 18 Sep 2023

0
Lien

Utiliser le lien direct vers cette réponse

https://fr.mathworks.com/matlabcentral/answers/2022687-draw-samples-using-a-non-gaussian-distribution#answer_1312947

Modifié(e) : Matt J le 18 Sep 2023

Ouvrir dans MATLAB Online

x = randn(100,1);
x(randperm(numel(x),50))
ans = 50×1
    0.1017
   -1.1903
   -0.6332
    0.9512
    0.9023
    0.0446
    0.0218
   -0.2012
    2.2907
   -0.4517

10 commentaires
Afficher 8 commentaires plus anciensMasquer 8 commentaires plus anciens

Torsten le 18 Sep 2023

Modifié(e) : Torsten le 18 Sep 2023

I've never seen examples of a distribution on a measure space that depends on a sample size of realizations. But maybe your guess is correct and the OP just wants to use "randperm".

user20912 le 19 Sep 2023

Your discussion helped me to clarify some concepts that, as Matt said, I had misunderstood.

Thanks

Connectez-vous pour commenter.

Answer 3

John D'Errico le 18 Sep 2023

0
Lien

Utiliser le lien direct vers cette réponse

https://fr.mathworks.com/matlabcentral/answers/2022687-draw-samples-using-a-non-gaussian-distribution#answer_1313007

Ouvrir dans MATLAB Online

I think you don't understand random numbers, AND you don't understand indexing in MATLAB. What does this do?

random('poisson',1:50)
ans = 1×50
     2     2     2     4     8     2    10    14    10    14    11    20    10    17    14     9    20    19    22    19    20    18    23    21    27    37    28    23    35    16

It generates a list of 50 Poisson random numbers, with 50 different rate parameters, the numbers 1:50. I have absolutely no idea why you would want to do that.

And worse, some of those Poisson samples will be ZERO. Some might even be larger than 100.

Then you are trying to index into a vector of length 100 using those Poisson samples. The result is? Garbage, since you will often see an indexing error.

Finally, the resulting sample? It will have the same underlying distribution as your original vector x, since you are just sampling from the original vector.

What you mean about a non-Gaussian sample? Using a Poisson distribution there is meaningless.

I think you just want to sample from the vector x. For example, suppose you have a vector of prime numbers. They certainly are not Gaussian, or even uniform.

x = primes(100)
x = 1×25
     2     3     5     7    11    13    17    19    23    29    31    37    41    43    47    53    59    61    67    71    73    79    83    89    97

Now we can sample from that set.

nx = numel(x);

This will be a sample WITH replacement. So some elements may be replicated.

x(randi(nx,[1,10]))
ans = 1×10
    53    41    97    19     7    79    31    19    43    79

Sampling without replacement is as easy.

x(randperm(nx,10))
ans = 1×10
    71    41    17    97     7    79    23    59    11    67

Be careful, as sampling without replacement is not possible, if you want to sample more than 25 elements from a vector of length 25. (Why do I feel this should be both unecessary to say, as well as totally necessary on this forum?)

3 commentaires
Afficher 1 commentaire plus ancienMasquer 1 commentaire plus ancien

John D'Errico le 18 Sep 2023

Modifié(e) : John D'Errico le 18 Sep 2023

Ouvrir dans MATLAB Online

@torsten: I'm sorry, but it is NOT wrong. It does matter how you sample, of course. So I might give you credit for misunderstanding my statement.

If the index operation has no connection to the samples themselves, then a random sampling from that set does not change the distribution of the sample. That is EXACTLY what was done by the OP.

For example, suppose I have a very simple set:

x = rand(1,5);

the vector x is statistically uniformly distributed, on the interval [0,1]. Now, choose 3 elements from that set, also randomly chosen.

y = x(randperm(5,3));

The vector y WILL be just as uniformly distributed on the interval [0,1], as was the vector x. Can you do things that are not what I was talking about, but are a bit silly? Yes.

z = x(ones(1,1000));

Then z is just a random number, with uniform distribution on the interval [0,1], but then repeated 1000 times. We could split hairs about the distribution of z, but that seems meaningless to me, and was not in the spirit of what I was saying.

Or, if you create y, by selectively sampling more often from the elements of x that are less than 1/2, or something equally silly, then of course it will change things.

Torsten le 18 Sep 2023

Modifié(e) : Torsten le 18 Sep 2023

Ouvrir dans MATLAB Online

z = x(ones(1,1000));

z is dirac_x(1) distributed.

As far as I understood, that's an extreme case for what the OP aims at: x(distribution(1:50)) where "distribution" can be any distribution defined on the set {1,2,...,100}. So indices (unlike for randperm) will most probably repeat.

Connectez-vous pour commenter.

Answer 4

Bruno Luong le 19 Sep 2023

0
Lien

Utiliser le lien direct vers cette réponse

https://fr.mathworks.com/matlabcentral/answers/2022687-draw-samples-using-a-non-gaussian-distribution#answer_1313157

Modifié(e) : Bruno Luong le 19 Sep 2023

Ouvrir dans MATLAB Online

x = rand(1,10000); % whatever, randn(1,100) in your case

n = 5000; % number of drawing 50 in your case

% Set up wanted drawing probability

ix = 1:numel(x);

lambda = 60;

p = poisspdf(ix,lambda);

% generate index according to the above discrete pdf

c = cumsum([0 p]);

c = c/c(end);

rix = discretize(rand(1,n), c);

% Check graphically what the drawing pdf looks like

histogram(rix,'Normalization','pdf')

current_xlim = xlim;

% Compare with the prescribed pdf, it looks OK

hold on

plot(ix, p, '-+', 'Linewidth', 2);

xlim(current_xlim)

title('This is truncated poisson pdf')

% draw randomy from x (with replacement) with the above pdf

rx = x(rix)

rx = 1×5000

0.1844 0.4804 0.7231 0.6603 0.0501 0.2910 0.5988 0.1482 0.5782 0.9062 0.2965 0.2509 0.5782 0.4804 0.1844 0.1709 0.4665 0.2965 0.5084 0.4335 0.2752 0.1482 0.4802 0.4920 0.4804 0.6473 0.6447 0.2910 0.1844 0.6543

% how it looks like, still random like

% histogram(x)

0 commentaires
Afficher -2 commentaires plus anciensMasquer -2 commentaires plus anciens

Connectez-vous pour commenter.

Draw samples using a Non-Gaussian distribution

5 commentaires
Afficher 3 commentaires plus anciensMasquer 3 commentaires plus anciens

Réponse acceptée

1 commentaire
Afficher -1 commentaires plus anciensMasquer -1 commentaires plus anciens

Plus de réponses (3)

10 commentaires
Afficher 8 commentaires plus anciensMasquer 8 commentaires plus anciens

3 commentaires
Afficher 1 commentaire plus ancienMasquer 1 commentaire plus ancien

0 commentaires
Afficher -2 commentaires plus anciensMasquer -2 commentaires plus anciens

Voir également

Catégories

Tags

Produits

Version

Community Treasure Hunt

Draw samples using a Non-Gaussian distribution

5 commentaires Afficher 3 commentaires plus anciensMasquer 3 commentaires plus anciens

Réponse acceptée

1 commentaire Afficher -1 commentaires plus anciensMasquer -1 commentaires plus anciens

Plus de réponses (3)

10 commentaires Afficher 8 commentaires plus anciensMasquer 8 commentaires plus anciens

3 commentaires Afficher 1 commentaire plus ancienMasquer 1 commentaire plus ancien

0 commentaires Afficher -2 commentaires plus anciensMasquer -2 commentaires plus anciens

Voir également

Catégories

Tags

Produits

Version

Community Treasure Hunt

5 commentaires
Afficher 3 commentaires plus anciensMasquer 3 commentaires plus anciens

1 commentaire
Afficher -1 commentaires plus anciensMasquer -1 commentaires plus anciens

10 commentaires
Afficher 8 commentaires plus anciensMasquer 8 commentaires plus anciens

3 commentaires
Afficher 1 commentaire plus ancienMasquer 1 commentaire plus ancien

0 commentaires
Afficher -2 commentaires plus anciensMasquer -2 commentaires plus anciens