Looking for something like a matrix version of randsample... [vectorization!]
7 vues (au cours des 30 derniers jours)
Afficher commentaires plus anciens
Dan Gianotti
le 25 Nov 2015
Commenté : Kirby Fears
le 9 Mar 2017
If I have a vector w (length n) and I want to pick a single random number between 1 and n using w as the weights, I can do something like
idx = randsample(n,1,true,w)
and I'll get a number between 1 and n with probability w(idx)/sum(w). Great.
Similarly, I have a matrix W (size N x M, where each of N,M is in the thousands or so), and I want to draw M random numbers between 1 and N, with the columns of W acting as independent weight vectors. I could obviously do
idx = zeros(N,1);
for i = 1:M
idx(i) = randsample(N,1,true,W(:,i));
end
...but I'm going to be calling this literally billions of times, so I'm looking for some efficiency.
I know that an equivalent way to think of this is to take my W matrix, normalize the columns so that they sum to one, do a cumsum on the columns, select a vector of uniform random numbers using rand(1,M), and find the first row indices where they are greater than the cumsum values, but I don't know how to do that without using a loop and find():
W_normalized = bsxfun(@rdivide,W,sum(W,1));
W_cdf = cumsum(W,1);
x = rand(1,M);
C = bsxfun(@lt,x,W_cdf);
and then the first row of each column of C with a 1 in it is my random number, but I haven't had any luck doing that in an efficient, vectorized way (I've seen this thread, but I think their conclusion to use a for-loop doesn't really seem to hold for larger matrices).
Any suggestions?
Thanks, Dan
0 commentaires
Réponse acceptée
Kirby Fears
le 25 Nov 2015
Modifié(e) : Kirby Fears
le 30 Nov 2015
Here's a pure matrix version of your bsxfun calls. It should be internally parallelized if your matrices are large.
W_normalized = W./repmat(sum(W,1),size(W,1),1);
W_cdf = cumsum(W_normalized,1);
x = rand(1,size(W,2));
C = repmat(x,size(W,1),1)<W_cdf;
You still need to find the first "true" value in C for each column to form an index.
Below I make a logical array where only the first "true" is present in each column:
idx = [C(1,:); xor(C(2:end,:),C(1:end-1,:))];
Hope this helps.
4 commentaires
Plus de réponses (0)
Voir également
Catégories
En savoir plus sur Logical dans Help Center et File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!