Generating a combination matrix within a certain condition

Question

Fabrício le 16 Sep 2022

0
Lien

Utiliser le lien direct vers cette question

https://fr.mathworks.com/matlabcentral/answers/1806800-generating-a-combination-matrix-within-a-certain-condition

Modifié(e) : Fabrício le 19 Sep 2022

I have a cell array with 1 to 10 vectors, with 1 to 40 numbers in each of them (all positive integers lower than 60), and I have to combine one value from each vector to a row in a matrix to produce a combination per column. I'm using combvec(cellarray{:}) to make the matrix but it consumes too much memory. I do have a condition that, for an exemple, reduces a matrix with 8892490 columns to 15744 but I can only use those conditions after assigning the matrix. The real condition is similar to a simple one like booleancomb = sum(matrix,1)<30 .

I tried generating the matrix to a matfile and then using the condition to get a logical array to the columns I actually want but I can't get only those columns out of a matfile...I also tried to change my data to single or uint but combvec always gives me double values.

The main thing I'm searching for is a less memory consuming way to manage this, if it uses others resources that would be fine.

I'm new to matlab so if you have any thoughts I would really appreciate!

% Vectors to be combined
cellarray{1} = [0 1 2 3];
cellarray{2} = [0 1 2];
combinations = combvec(cellarray{:}); % Uses too much memory
booleancomb = sum(combinations,1)<3; % Condition
disp(combinations(:,booleancomb))
     0     1     2     0     1     0
     0     0     0     1     1     2

0 commentaires
Afficher -2 commentaires plus anciensMasquer -2 commentaires plus anciens

Connectez-vous pour commenter.

Connectez-vous pour répondre à cette question.

Answer 1

Bruno Luong le 16 Sep 2022

2
Lien

Utiliser le lien direct vers cette réponse

https://fr.mathworks.com/matlabcentral/answers/1806800-generating-a-combination-matrix-within-a-certain-condition#answer_1054990

Modifié(e) : Bruno Luong le 16 Sep 2022

Ouvrir dans MATLAB Online

I hope it works. My output is transposed, meaning the number of columns of comb is number of sets.

The maxsum is upper bound of sum(comb,2)

clear
testbigcase = false;
if testbigcase
    cellarray = cell(1,10);
    for k=1:length(cellarray)
        cellarray{k} = randi(10,1,3+randi(3));
    end
    smax = 30;
else
    cellarray{1} = [0 1 2 3];
    cellarray{2} = [0 1 2];
    smax = 2;
end
comb = combwithhisum(smax, cellarray{:});
if ~testbigcase
    disp(comb)
end
     0     0
     0     1
     1     0
     0     2
     1     1
     2     0
fprintf('find %d/%d combinations\n', size(comb,1), prod(cellfun(@numel, cellarray)))
find 6/12 combinations
%%
function comb = combwithhisum(smax, varargin)
comb = [];
if ~isempty(varargin)
    c1 = reshape(varargin{1}, 1, []);
    c1 = c1(c1 <= smax);
    if isempty(c1)
        return
    end
    c1 = sort(c1);
    if length(varargin) > 1
        c = combwithhisum(smax-c1(1), varargin{2:end});
        if ~isempty(c)
            s = [sum(c,2); Inf];
            for v=c1
                last = discretize(smax-v, s);
                if last>=1
                    comb = vertcat(comb, [repelem(v,last,1), c(1:last,:)]); %#ok
                end
            end
            [~,is] = sort(sum(comb,2));
            comb = comb(is,:);
        end
    else
        comb = reshape(c1, [], 1);
    end
end
end

4 commentaires
Afficher 2 commentaires plus anciensMasquer 2 commentaires plus anciens

Bruno Luong le 17 Sep 2022

Modifié(e) : Bruno Luong le 17 Sep 2022

Ouvrir dans MATLAB Online

Here is a sightly optimized version (sort only once, discretize outside for-loop)

clear
testbigcase = true;
if testbigcase
    cellarray = cell(1,10);
    for k=1:length(cellarray)
        cellarray{k} = randi(10,1,3+randi(3),'uint8');
    end
    smax = 30;
else
    cellarray{1} = [0 1 2 3];
    cellarray{2} = [0 1 2];
    smax = 2;
end
comb = combwithhisum(smax, cellarray{:});
disp(comb)
fprintf('find %d/%d combinations\n', size(comb,1), prod(cellfun(@numel, cellarray)))
%%
function [comb, s] = combwithhisum(smax, varargin)
% function comb = combwithhisum(smax, A1, A2, ...)
% Find all combination of AI, A2, ... An integer arrays
% comb of size (m x n) such that sum(comb,2) <= smax (scalar).
% NOTE: comb is row-sorted by their sums
comb = [];
s = [];
if ~isempty(varargin)
    c1 = reshape(varargin{1}, 1, []);
    c1 = c1(c1 <= smax);
    if isempty(c1)
        return
    end
    if length(varargin) > 1
        smax = smax-double(c1);
        [c, s] = combwithhisum(max(smax), varargin{2:end});
        if ~isempty(c)
            last = discretize(smax, [s; Inf]);
            for k = 1:length(c1)
                m = last(k);
                if m>=1
                    v = repelem(c1(k),m,1);
                    comb = vertcat(comb, [v, c(1:m,:)]); %#ok
                end
            end
        end
    else
        comb = reshape(c1, [], 1);
    end
    [s, is] = sort(sum(comb,2));
    comb = comb(is,:);
end
end % combwithhisum

Bruno Luong le 17 Sep 2022

Modifié(e) : Bruno Luong le 17 Sep 2022

Ouvrir dans MATLAB Online

One more version, remove for-loop and growing array. It is about 10% faster acoording to my test

%%
function [comb, s] = combwithhisum(smax, varargin)
% function comb = combwithhisum(smax, A1, A2, ...)
% Find all combination of AI, A2, ... An integer arrays
% comb of size (m x n) such that sum(comb,2) <= smax (scalar).
% NOTE: comb is row-sorted by their sums
comb = [];
s = [];
if ~isempty(varargin)
    c1 = reshape(varargin{1}, [], 1);
    if length(varargin) > 1
        smax = smax-double(c1);
        maxsmax = max(smax);
        if maxsmax >= 0
            [c, s] = combwithhisum(maxsmax, varargin{2:end});
            if ~isempty(c)
                last = discretize(smax, [s; Inf]);
                k = last > 0;
                last = last(k);
                if ~isempty(last)
                    i = cumsum(last);
                    r = ones(1, i(end));
                    r(1+i(1:end-1)) = 1-last(1:end-1);
                    r = cumsum(r);
                    comb = [repelem(c1(k), last, 1), c(r,:)];
                end
            end
        end
    else
        comb = c1;
    end
    [s, is] = sort(sum(comb,2));
    comb = comb(is,:);
end
end % combwithhisum

Fabrício le 17 Sep 2022

Modifié(e) : Fabrício le 19 Sep 2022

This approach is super fast! One of the best pros is that it basicaly doesn't matter how big each vector from the cellarray is, it takes the same run time. If the input smax is limiting enough you could run a really large cellarray with it. Thank you Bruno!

*Update: Just tested in my project and it works really well with the inputs I have. Since I have a very limiting smax, even if I keep my data as double it barely uses any RAM. Just one thing to keep in mind, if someone else is going to use Bruno's function, don't forget to apply smax limit for situations when the cellarray has only 1 vector.

Connectez-vous pour commenter.

Answer 2

Jan le 16 Sep 2022

1
Lien

Utiliser le lien direct vers cette réponse

https://fr.mathworks.com/matlabcentral/answers/1806800-generating-a-combination-matrix-within-a-certain-condition#answer_1054965

Modifié(e) : Jan le 16 Sep 2022

Ouvrir dans MATLAB Online

It is easy to modify the code of a copy of Matlab's combvec function, which uses the class of the input: Change the zeros(., .) commands to zeros(., ., class(m)).

You can use a much faster replacement of combvec which considers the class of the inputs and allocates the complete output in one step:

c = cell(1, 7);
c(:) = {uint8(1:10)};
tic;
r1 = combvec(c{:});
toc
Elapsed time is 1.407195 seconds.
tic;
r2 = myCombVec(c{:});
toc
Elapsed time is 0.070431 seconds.
isequal(r1, r2)
ans = logical
   1
function y = myCombVec(varargin)
% (C) 2022, Jan, Heidelberg, CC BY-SA 3.0
m = nargin;
n = cellfun('prodofsize', varargin);
a = 1;
b = prod(n);
y = zeros(m, b, class(varargin{1}));
for k = 1:m
    x = varargin{k};
    b = b / n(k);
    y(k, :) = repmat(repelem(x(:).', 1, a), 1, b);
    a = a * n(k);
end
end

2 commentaires
Afficher AucuneMasquer Aucune

Fabrício le 16 Sep 2022

This really improved how far I could go with the amount of combinations and solved the class problem, now using 8 times less memory, but I wonder if there is a way to apply the condition in this... The range of combinations I need is so much smaller than the whole matrix.

But your answer helped me a lot! Thank you Jan.

Jan le 17 Sep 2022

Ouvrir dans MATLAB Online

This is a naive approach, which creates all combinations and stores only the the matching ones:

function Y = myCombVecLim(Lim, varargin)
% (C) 2022, Jan, Heidelberg, CC BY-SA 3.0
C    = varargin;
nC   = numel(C);
lenC = cellfun('prodofsize', C);
step = 1e4;  % Let the array grow in fair steps
nY   = prod(lenC);
iY   = 0;    % Current column of output
wY   = min(step, nY);
Y    = zeros(nC, wY, class(C{1}));  % Transposed at first
v   = cellfun(@(x) x(1), C, 'UniformOutput', 1);
idx = ones(nC, 1);  % idx(p) is current index of C{p}
for k = 1:nY
   if sum(v) < Lim         % Store output for feasible column
      iY = iY + 1;         % Proceed to next column of Y
      if iY > wY           % Increase Y in steps
         wY       = wY + step;
         Y(1, wY) = 0;     % Pre-allocates block with zeros
      end
      Y(:, iY) = v;        % Copy v to Y
   end
   
   for p = 1:nC            % Get next combination
      if idx(p) < lenC(p)  % Next element of C{p}
         idx(p) = idx(p) + 1;
         v(p)   = C{p}(idx(p));
         break;            % Stop for p loop
      else                 % Last element of C{p}, reset to 1:
         idx(p) = 1;
         v(p)   = C{p}(1);
      end
   end
end
Y = Y(:, 1:iY);            % Crop unused elements
end

Bruno's method is much faster.

Connectez-vous pour commenter.

Answer 3

Jan le 17 Sep 2022

1
Lien

Utiliser le lien direct vers cette réponse

https://fr.mathworks.com/matlabcentral/answers/1806800-generating-a-combination-matrix-within-a-certain-condition#answer_1055365

Modifié(e) : Jan le 17 Sep 2022

Ouvrir dans MATLAB Online

After some test I could simplify the original combvec and including the limit is easy also:

% Without limit, but considering the type of the input:  ******************
function y = CombVecFast(varargin)
% (C) 2022, Jan, Heidelberg, CC BY-SA 3.0
if nargin == 0
    y = [];
    return;
end
y = varargin{1};
for k = 2:nargin
    z = varargin{k};
    y = [repmat(y, 1, size(z, 2)); repelem(z, 1, size(y, 2))];
end
end
% With limit: *******************************************************
function y = CombVecLim(Lim, varargin)
% (C) 2022, Jan, Heidelberg, CC BY-SA 3.0
if nargin < 2
   y = [];
   return;
end
y = varargin{1};
% Care for saturated integers:
if ~isfloat(y) && Lim < intmax(class(y))
   sumType = 'native';
else
   sumType = 'double';
end
for k = 2:nargin - 1
    z = varargin{k};
    y = [repmat(y, 1, size(z, 2)); repelem(z, 1, size(y, 2))];
    y = y(:, sum(y, 1, sumType) < Lim);
end
end

The speed can compete with Bruno's implementation.

4 commentaires
Afficher 2 commentaires plus anciensMasquer 2 commentaires plus anciens

Bruno Luong le 18 Sep 2022

"This removes a small number of columns only"

I see, it remove almost nothing. The mean of 1:3+randi(3) is 3. sum of 10 those must have mean of 30 of standard deviation not bery large. The smax = 50 is too hight and have very little effect.

Jan le 18 Sep 2022

Ouvrir dans MATLAB Online

@Bruno Luong: Exactly: If none or almost none columns are removed, the smart method to avoid to produce them loses its power.

On the other hand the speed depends on the input data:

cellarray = cell(1,10);
for k=1:length(cellarray)
    % Your data: cellarray{k} = randi(10,1,3+randi(3),'uint8');
    cellarray{k} = uint8(1:3+randi(3));
end
smax = 30;

With these data CombVecLim needs about 60% of the time used by combwithhisum2, which is slighly faster than combwithhisum3.

But for

cellarray{k} = randi(10,1,3+randi(20),'uint8');
smax = 20;

combwithhisum3 is 4 times faster than CombVecLim3.

CombVecLim3 is an improved version, which avoids to compute the sum repeatedly, but uses the result of the former iteration. In addition it does not create the full matrix y to remove the unwanted limits afterwards, but collects the wanted columns only:

function y = CombVecLim3(Lim, varargin)
if nargin < 2
   y = [];
   return;
end
y = varargin{1};
s = y;                             % Current sum
if ~isfloat(y) && Lim >= intmax(class(y))
   s   = double(y);                % Care for saturation of integers
   Lim = double(Lim);
end
for k = 2:nargin - 1
   z = varargin{k};                % Next vector from inputs
   a = repmat( y, 1, width(z));    % Multiple blocks of existing data
   b = repelem(z, 1, width(y));    % Repeated elements of this vector
   
   s = repmat(s, 1, width(z)) + b; % Sum over all columns
   m = s < Lim;                    % Mask for accepted columns
   if all(m)                       % Include all columns
      y = [a; b];
   else                            % Include masked columns only
      s = s(m);
      y = [a(:, m); b(m)];
   end
end
end

Of course, this is less charming than the compact former version.

According to my measurements under R2018b your combwithhisum3 is twice as fast for

cellarray{k} = randi(10,1,3+randi(10),'uint8');
smax = 20;

and has the same speed for:

cellarray{k} = randi(10,1,3+randi(5),'uint8');
smax = 40;

One problem remains: For UINT8 input and Lim=255 the criterion sum(y,1)<Lim is fragile due to the saturation of integers.

I really like your smart and elegant solution. Reading it is like contemplating classic art. And I had some fun and insights with providing the naive approach myCombVecLim and the dull one CombVecLim, which is even fast for some kinds of inputs. The OP has some methods to check, which one works with the data at best. And all provided solutions are much faster than Matlab's combvec. I'll write an enhancement request.

This thread became a useful tutorial about how to solve a combination problem with limited column sums and how to compare the different approachs.

Connectez-vous pour commenter.

Generating a combination matrix within a certain condition

0 commentaires
Afficher -2 commentaires plus anciensMasquer -2 commentaires plus anciens

Réponse acceptée

4 commentaires
Afficher 2 commentaires plus anciensMasquer 2 commentaires plus anciens

Plus de réponses (2)

2 commentaires
Afficher AucuneMasquer Aucune

4 commentaires
Afficher 2 commentaires plus anciensMasquer 2 commentaires plus anciens

Voir également

Catégories

Tags

Produits

Version

Community Treasure Hunt

Generating a combination matrix within a certain condition

0 commentaires Afficher -2 commentaires plus anciensMasquer -2 commentaires plus anciens

Réponse acceptée

4 commentaires Afficher 2 commentaires plus anciensMasquer 2 commentaires plus anciens

Plus de réponses (2)

2 commentaires Afficher AucuneMasquer Aucune

4 commentaires Afficher 2 commentaires plus anciensMasquer 2 commentaires plus anciens

Voir également

Catégories

Tags

Produits

Version

Community Treasure Hunt

0 commentaires
Afficher -2 commentaires plus anciensMasquer -2 commentaires plus anciens

4 commentaires
Afficher 2 commentaires plus anciensMasquer 2 commentaires plus anciens

2 commentaires
Afficher AucuneMasquer Aucune

4 commentaires
Afficher 2 commentaires plus anciensMasquer 2 commentaires plus anciens