Randomly select values from a vector that yield a target mean

I have a specific list of values, which can be arrayed in a vector: AR = [-.5, -.3, -.1, 0, .1, .3, .5] (note that it is not important for the values to be arranged in a vector necessarily)
I want to specify a target mean. E.g.: TM = -.1
Then, I want MATLAB to randomly select any N values (with replacement) from vector AR, as long as those N values yield the target mean. For example,
AR = [-.5, -.3, -.1, 0, .1, .3, .5]
TM = -.1
N = 4
% magic code goes here, and MATLAB gives:
Output = [-.3, .1, -.5, .3]
N randomly selected values from a given set with a mean of TM. Any ideas how to achieve this?

2 comments

In general, this won't be possible (e.g., AR = [1 2 6], TM = 3, N = 2) unless there are additional constraints on the inputs AR, TM, and N. Do you know these constraints?
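The counterexample in this comment can be checked by brute force. The following is a minimal sketch, assuming an absolute tolerance of 1e-9 (my choice, not from the thread) and a hypothetical variable name isFeasible:

```matlab
% Brute-force feasibility check: does any N-tuple drawn from AR
% (with replacement) have mean TM?
AR = [1 2 6];
TM = 3;
N  = 2;

IDX = cell(1, N);
[IDX{:}] = ndgrid(1:numel(AR));       % all index combinations
IDXmat = cell2mat(cellfun(@(M) M(:), IDX, 'UniformOutput', false));
vals = AR(IDXmat);                    % numel(AR)^N-by-N matrix of values

% isFeasible is a hypothetical name; 1e-9 is an assumed tolerance
isFeasible = any(abs(mean(vals, 2) - TM) <= 1e-9);
```

For these inputs no pair of values sums to 6, so isFeasible comes out false.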
Please don't add answers just to make a comment. Moved from an answer:
"Oh yeah - this will be part of an experiment with hundreds of trials. AR will be constant across all of them.
On each trial, I will specify what TM and N are for that trial, and I will never specify impossible values. In most cases, N = 4. In fewer cases, N = 6. But in all cases, a given trial's TM will be achievable given AR and N."


Accepted Answer

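nAR = length(AR);
% Enumerate every length-N index combination (with replacement, order matters)
[IDX{1:N}] = ndgrid(1:nAR);
IDXmat = cell2mat(cellfun(@(M) M(:), IDX, 'UniformOutput', false));
% Row sums of the selected values; a mean of TM corresponds to a total of TM*N
line_totals = sum(AR(IDXmat), 2);
mask = ismembertol(line_totals, TM*N);
matchingIDX = IDXmat(mask,:);
% Pick one matching combination at random; these are indices into AR
randchoice = randi(size(matchingIDX,1));
rand_ARentries = matchingIDX(randchoice, :);
Output = AR(rand_ARentries);   % the selected values themselves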
You indicate that AR is constant, so IDXmat and line_totals can be pre-calculated once for each value of N.
The mask and matchingIDX can only be calculated once TM is known.
Once the combinations that match the required total are known, then one of them is chosen at random. If you are doing a number of trials with the same AR and same TM, then you only have to do the randchoice and rand_ARentries once per trial. If you want T random trials then
randchoices = randi(size(matchingIDX,1), T, 1);
rand_ARentries = matchingIDX(randchoices, :);
will select all T of them simultaneously.
If length(AR) or N is large, this code can use a lot of memory, because IDXmat has length(AR)^N rows. But once everything is set up, the cost of generating a new random combination is quite small.
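The split between one-time setup and per-trial work might look like the following sketch. The tolerance tol and the trial count T = 5 are assumptions for illustration, and an explicit tolerance test is used in place of ismembertol so that it also runs on releases that lack that function:

```matlab
AR = [-.5, -.3, -.1, 0, .1, .3, .5];
N  = 4;

% Done once, because AR and N are fixed
IDX = cell(1, N);
[IDX{:}] = ndgrid(1:numel(AR));
IDXmat = cell2mat(cellfun(@(M) M(:), IDX, 'UniformOutput', false));
line_totals = sum(AR(IDXmat), 2);

% Done once TM is known
TM  = -0.1;
tol = 1e-9;                                   % assumed floating-point tolerance
matchingIDX = IDXmat(abs(line_totals - TM*N) <= tol, :);

% Done per batch of trials that share this TM
T = 5;                                        % hypothetical number of trials
randchoices = randi(size(matchingIDX, 1), T, 1);
Output = AR(matchingIDX(randchoices, :));     % T-by-N matrix of selected values
```

Each row of Output is one trial's selection, and every row has mean TM up to the tolerance.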

10 comments

For a limited size problem, this solves it the way I want to solve it - i.e., by generating all possible selections, then choosing randomly from those that satisfy the sum constraint. The caveat is that as AR gets large, this solution becomes intractable. But that simply means the problem itself will become intractable.
For a given TM, there are ways of saving memory while still being reasonably efficient.
Those approaches could be improved further if none of the AR entries were negative, but based on the examples it appears that negative values are permitted.
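The comment does not say which memory-saving approach is meant. One possible sketch, which is my assumption and not necessarily what the commenter had in mind, is to enumerate only the last N-1 positions and loop over the first one, keeping just the matching rows. This holds roughly 1/length(AR) of the full grid in memory at a time:

```matlab
AR  = [-.5, -.3, -.1, 0, .1, .3, .5];
TM  = -0.1;
N   = 4;
nAR = numel(AR);
tol = 1e-9;                                   % assumed floating-point tolerance

% Enumerate the last N-1 positions once (nAR^(N-1) rows instead of nAR^N)
SUB = cell(1, N-1);
[SUB{:}] = ndgrid(1:nAR);
SUBmat = cell2mat(cellfun(@(M) M(:), SUB, 'UniformOutput', false));
sub_totals = sum(reshape(AR(SUBmat), size(SUBmat)), 2);

% Loop over the first position and keep only rows that hit the target total
matchingIDX = zeros(0, N);
for first = 1:nAR
    mask = abs(AR(first) + sub_totals - TM*N) <= tol;
    rows = SUBmat(mask, :);
    matchingIDX = [matchingIDX; repmat(first, size(rows, 1), 1), rows]; %#ok<AGROW>
end
```

The resulting matchingIDX contains the same rows as the full enumeration, in a different order, so a random row can be drawn from it with randi exactly as before.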
Yeah, I understand. Thank you all.
Unfortunately Walter Robinson's code did not work. I receive the following errors:
Undefined function 'makecol' for input arguments of type 'double'.
Error in ismembertol (line 10)
equiv = abs(bsxfun(@minus, makecol(a), makerow(b))) < tol;
Error in AR_KnownMean (line 5)
mask = ismembertol(line_totals, TM);
Maybe I can simplify my request. If I can obtain a matrix in which each row sums to a target sum, that would be sufficient.
e.g.,
AR = [-.4, -.3, -.2, 0, .2, .3, .4]
N = 4;
TargetSUM = .4;
% Smart person's code goes here
Target_Sum_By_Row =
.4 0 0 0
0 .4 0 0
0 0 .4 0
0 0 0 .4
.2 .2 0 0
0 .2 .2 0
etc.
Again, "AR" is a specified vector of values. "N" is the number of values, pulled from AR with replacement, that must sum to "TargetSUM". The desired outcome is a matrix in which each row, N columns wide, sums to TargetSUM. I hope that's clear!
You appear to be using a third-party ismembertol. What is displayed when you run the following?
which -all ismembertol
In your sample output the entries with 0.1 are not valid as that does not occur in AR.
If the entries in AR are fixed distances apart there might possibly be an improvement available in the algorithm. If it is not necessary to choose from AR and you just need N random values with the given sum then improvements are available for sure.
Yes, I was using a 3rd party ismembertol. Otherwise, I don't have any function called ismembertol.
And, in my sample output the rows with .1 were mistakes. It must occur in AR, you are right.
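Your third-party ismembertol relies on functions that you did not install.
Replace the line that calls ismembertol with an explicit tolerance test:
mask = abs(line_totals - TM*N) <= 1e-5;
Here line_totals holds row sums, so compare against TM*N when TM is a target mean. If you are already working with a target sum, compare against that sum directly.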
ismembertol was introduced in R2015a. When you are using an older version it is always a good idea to mention that so that the volunteers do not give solutions based upon functions you do not have.
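Putting the tolerance test together with the simplified request (a matrix whose rows each sum to TargetSUM) might look like this sketch. It avoids ismembertol entirely; the 1e-5 tolerance is the one suggested in the comment above:

```matlab
AR = [-.4, -.3, -.2, 0, .2, .3, .4];
N = 4;
TargetSUM = .4;

% All length-N index combinations into AR (with replacement, order matters)
IDX = cell(1, N);
[IDX{:}] = ndgrid(1:numel(AR));
IDXmat = cell2mat(cellfun(@(M) M(:), IDX, 'UniformOutput', false));

line_totals = sum(AR(IDXmat), 2);
mask = abs(line_totals - TargetSUM) <= 1e-5;   % tolerance test instead of ismembertol
Target_Sum_By_Row = AR(IDXmat(mask, :));       % each row sums to TargetSUM
```

A single random selection is then one row of the matrix, for example Target_Sum_By_Row(randi(size(Target_Sum_By_Row, 1)), :).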
ok, thank you.
this works! thank you Walter.


More Answers (1)

Jeff Miller on 1 Mar 2018
Well, this is pretty ugly, but you could just repeatedly sample N numbers from AR randomly, stopping when they give you TM. That might take a while, but maybe you could find the numbers giving the desired (N,TM) pairs in advance (i.e., have them all ready when it was time to start your experiment).
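A minimal sketch of this repeated-sampling idea follows. The tolerance tol and the safety cap maxTries are assumptions added for illustration, so that an unreachable TM cannot loop forever:

```matlab
AR = [-.5, -.3, -.1, 0, .1, .3, .5];
TM = -0.1;
N  = 4;
tol = 1e-9;          % assumed floating-point tolerance
maxTries = 1e6;      % hypothetical safety cap

Output = [];
for k = 1:maxTries
    candidate = AR(randi(numel(AR), 1, N));   % N draws with replacement
    if abs(mean(candidate) - TM) <= tol
        Output = candidate;                   % accept the first draw that hits TM
        break
    end
end
if isempty(Output)
    error('No selection with mean %g found in %d tries.', TM, maxTries);
end
```

Because every draw is uniform over all ordered N-tuples, an accepted draw is uniform over the tuples that match TM. The memory cost is negligible, but the running time depends on how many tuples match.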
