Randomly generate integers with a non-uniform distribution

YouBetterWerk on 1 Apr 2018
I am trying to generate a matrix of random integers from 1 to 4, but I would like to define the distribution rather than have it be uniform. I have managed to generate a 5x5 matrix, but it follows a uniform distribution. Can I specify the distribution (e.g. 10% of 1s, 20% of 2s, 30% of 3s and the rest 4s)? Thank you.
randi([1,4],5,5)

Answers (1)

Walter Roberson on 1 Apr 2018
Use randsample() and specify a weights vector.
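A minimal sketch of that suggestion, assuming the Statistics and Machine Learning Toolbox is installed (randsample() is part of it). The variable names weights and M are illustrative, and the weights are the percentages from the question:

```matlab
% Target probabilities for the values 1..4 (10%, 20%, 30%, 40%, as in the question).
weights = [0.1 0.2 0.3 0.4];

% Draw 25 values from 1:4 with replacement, using the weights,
% then reshape the resulting vector into a 5x5 matrix.
M = reshape(randsample(4, 25, true, weights), 5, 5);
disp(M)
```

With only 25 draws the observed proportions will vary noticeably from run to run; they approach the weights only for larger matrices.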
7 Comments
YouBetterWerk on 2 Apr 2018
Sorry, I don't quite understand this line.
sum( rand(arraysize) >= permute(cumsum(weights), [3 1 2]), 3) + 1;
[3 1 2] % Must this be in this particular order? What does the ordering do?
, 3) + 1 % I don't quite understand this bit either.
Walter Roberson on 2 Apr 2018
I forgot to mention that the code requires R2016b or later.
rand(arraysize) is (for example) 3 x 3, which is a 2-dimensional array. We need to compare every element in it to every element of cumsum(weights), here of length 4, temporarily producing a 3 x 3 x 4 array. To do that we rearrange the 1 x 4 vector cumsum(weights) to be 1 x 1 x 4. The easiest way is permute() with the order [3 1 2]: a 1 x 4 array is the same as a 1 x 4 x 1 array, so putting its third dimension (length 1) first, followed by the original first and second dimensions, gives 1 x 1 x 4. The order matters because it decides which dimension the 4 values end up along. Another way of doing the permute() would be to use
reshape(cumsum(weights), 1, 1, length(weights))
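As a quick check that the two approaches agree, here is a small sketch; the example weights and the names c, p and r are illustrative and not part of the original code:

```matlab
weights = [0.1 0.2 0.3 0.4];      % example weights, assumed to sum to 1
c = cumsum(weights);              % 1 x 4, approximately [0.1 0.3 0.6 1.0]

p = permute(c, [3 1 2]);          % dims (1, 4, 1) reordered to (1, 1, 4)
r = reshape(c, 1, 1, length(c));  % same data, target size given directly

disp(size(p))                     % size is 1 1 4
disp(isequal(p, r))               % true: both give the same 1 x 1 x 4 array
```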
So now we have a 3 x 3 array of random values, which is also a 3 x 3 x 1 array, and we have the 1 x 1 x 4 vector of values to compare against. Those are different sizes, but with R2016b or later we can take advantage of implicit expansion: any dimension that has length 1 in one operand and a different length in the other is automatically replicated to match. Here the third dimension has length 1 in the 3 x 3 (that is, 3 x 3 x 1) array and length 4 in the 1 x 1 x 4 array, while the first two dimensions have length 1 in the 1 x 1 x 4 array and length 3 in the other. So each element of the random values is compared against the 4 different cumulative weights, giving a 3 x 3 x 4 result. You can also write the operation explicitly with bsxfun(), which works in releases before R2016b as well:
sum( bsxfun(@ge, rand(arraysize), reshape(cumsum(weights), 1, 1, length(weights))), 3 ) + 1
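To see that the implicit-expansion form and the bsxfun() form produce the same comparison, both can be applied to one fixed set of random values. This is only an illustrative sketch; R, c, cmpImplicit and cmpBsxfun are made-up names:

```matlab
weights = [0.1 0.2 0.3 0.4];             % example weights, assumed to sum to 1
R = rand(3, 3);                          % one fixed set of random values
c = reshape(cumsum(weights), 1, 1, []);  % 1 x 1 x 4 thresholds

cmpImplicit = R >= c;                    % implicit expansion (R2016b or later)
cmpBsxfun   = bsxfun(@ge, R, c);         % explicit form for older releases

disp(size(cmpImplicit))                  % size is 3 3 4
disp(isequal(cmpImplicit, cmpBsxfun))    % true: identical logical arrays
```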
The sum() along the third dimension (the 3) returns a 2D array, in this case 3 x 3. This gives the number of entries in the cumulative sum that the random number met or exceeded. 0 means that the value was smaller than the first entry in the cumulative weights table, 1 means that the value was at least the first entry but smaller than the second, and so on. rand() cannot exceed 1 (and cannot exactly reach 1 either), so provided the weights sum to 1 the sum cannot reach 4, and you get values from 0 to 3. Add 1 to those to get 1 to 4, which are the indices of the entries to look up.
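The counting step can be traced by hand for a single value and then checked empirically on a large array. The sketch below assumes the weights from the question; u, k, X and freq are illustrative names, and the 1000 x 1000 size is an arbitrary choice for the frequency check:

```matlab
weights = [0.1 0.2 0.3 0.4];   % must sum to 1
c = cumsum(weights);           % thresholds, approximately [0.1 0.3 0.6 1.0]

% A single draw: 0.45 is >= 0.1 and >= 0.3, but < 0.6 and < 1.0,
% so two comparisons are true and the result is 2 + 1 = 3.
u = 0.45;
k = sum(u >= c) + 1;
disp(k)                        % 3

% Many draws: the observed frequencies should be close to the weights.
arraysize = [1000 1000];
X = sum(rand(arraysize) >= permute(c, [3 1 2]), 3) + 1;
freq = accumarray(X(:), 1)' / numel(X);
disp(freq)
```

The frequencies printed at the end differ slightly on each run because the draws are random, but with a million samples they should sit very close to 0.1, 0.2, 0.3 and 0.4.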
