Memory cost of multiplying sparse matrices

Question

2 votes

What is the memory cost for multiplying sparse matrices? It seems to be much larger than the memory used by either of the matrices being multiplied:

>> A = sprand(5e9,2, 1e-7); B = sparse(eye(2));
whos
  Name               Size            Bytes  Class     Attributes
  A         5000000000x2             16024  double    sparse    
  B                  2x2                56  double    sparse    
  
>> A*B;
Error using  * 
Out of memory. Type HELP MEMORY for your options.

As you can see in the example above, the sparse matrices A and B are not taking up much memory, but computing A*B still results in an out of memory error. Why does this happen, and is there a way to avoid it?

8 comments (6 older comments not shown)

AS on 18 Sep 2020

I'm using R2018a on a 16 GB machine. I don't see a spike in memory usage when trying a slightly smaller size than the one causing the error, but the computation is so fast that I don't think htop or a task manager would pick it up.

Bruno Luong on 18 Sep 2020

Edited: Bruno Luong on 18 Sep 2020

Agreed that the task manager could miss it. I don't see any spike on my 32 GB PC while

AB = A*B

is being carried out in MATLAB.

Answer 1 (accepted answer)

Matt J on 15 Sep 2020

Edited: Matt J on 15 Sep 2020

0 votes

I believe it is simply because MATLAB's sparse matrix routines don't handle very tall, thin matrices well. The product becomes much faster and uses far less memory if you reshape A to have a few orders of magnitude fewer rows and do the following equivalent computation:

 A = sprand(5e9,2, 1e-7); B = speye(2);
 
 %%Begin workaround
 
 Ar=reshape(A,[],1000);
 Br=kron(B, speye(500));
 
 result= reshape(Ar*Br,size(A));
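
The equivalence behind this workaround can be checked at a small scale, where the direct product A*B is still feasible. The following is a minimal sketch under assumed sizes: m, c, and the entries of B are illustrative choices, not values from the thread, and m must be divisible by c.

```matlab
% Small-scale check of the reshape/kron identity (illustrative sizes).
m = 6000;                       % number of rows of A; must be divisible by c
c = 500;                        % reshaped columns per original column of A
A = sprand(m, 2, 0.01);
B = sparse([1 2; 3 4]);         % a general 2-by-2 B, not just the identity

Ar = reshape(A, [], 2*c);       % (m/c)-by-(2*c); first c columns come from A(:,1)
Br = kron(B, speye(c));         % (2*c)-by-(2*c) block matrix [B11*I B12*I; B21*I B22*I]
result = reshape(Ar*Br, size(A));

% Largest absolute deviation from the direct product
err = full(max(max(abs(result - A*B))));
```

The check works because reshape is column-major: the first c columns of Ar hold A(:,1) and the last c hold A(:,2), so multiplying by kron(B, speye(c)) forms the same linear combinations of the two columns as A*B does.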

9 comments (7 older comments not shown)

AS on 15 Sep 2020

Edited: AS on 15 Sep 2020

The fact that the matrices are so tall is not really related to my previous question, but is just a property of the problem I'm trying to solve. I am trying to do tensor-contractions with sparse tensors.

When doing tensor contractions with multi-dimensional arrays, each contraction is done by reshaping the involved tensors pairwise to matrices such that the contraction between them can be done as an ordinary matrix multiplication.

For example, consider the tensor contraction

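$$X_{r p q} = \sum_{\alpha=1}^{M} \sum_{\beta=1}^{N} A_{r \alpha \beta} \, B_{\alpha p} \, C_{\beta q},$$

where A is an (R, M, N) tensor, B is an (M, P) tensor, and C is an (N, Q) tensor. The result X will be an (R, P, Q) tensor, which is obtained as follows: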

% permute 2nd and 3rd dimensions of A
Aperm = permute(A, [1 3 2]);
% reshape Aperm to a (R*N, M) array
Areshp = reshape(Aperm, [R*N, M]);
% perform contractions between A and B as a matrix product
AB = Areshp * B;
% reshape AB to the right dimensions
AB = reshape(AB, [R,N,P]);
% permute 2nd and 3rd dimensions of AB to get dimension represented by index beta at the back again
ABperm = permute(AB, [1 3 2]);
% reshape ABperm to a (R*P, N) array
ABreshp = reshape(ABperm, [R*P, N]);
%perform contraction between AB and C as a matrix product
ABC = ABreshp * C;
% reshape to right dimensions to get final result
X = reshape(ABC, [R, P, Q]);
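
The permute/reshape sequence above can be checked against an explicit summation on small dense arrays. This is a minimal sketch: the sizes and random data are assumptions chosen for illustration, not taken from the thread.

```matlab
% Illustrative sizes and data (assumptions, not from the thread)
R = 3; M = 4; N = 5; P = 2; Q = 6;
A = rand(R, M, N); B = rand(M, P); C = rand(N, Q);

% Contraction via permute/reshape and two matrix products
AB = reshape(permute(A, [1 3 2]), [R*N, M]) * B;      % (R*N)-by-P
AB = permute(reshape(AB, [R, N, P]), [1 3 2]);        % R-by-P-by-N
X  = reshape(reshape(AB, [R*P, N]) * C, [R, P, Q]);   % R-by-P-by-Q

% Reference: explicit summation over the contracted indices m and n
Xref = zeros(R, P, Q);
for r = 1:R
    for p = 1:P
        for q = 1:Q
            for m = 1:M
                for n = 1:N
                    Xref(r,p,q) = Xref(r,p,q) + A(r,m,n)*B(m,p)*C(n,q);
                end
            end
        end
    end
end

maxErr = max(abs(X(:) - Xref(:)));   % should be at rounding-error level
```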

There is a handy function called ncon that performs such tensor contractions for ordinary arrays; it is explained in this article, and the code is included in the source of the preprint.

I am trying to do this in general, while working with very large, but very sparse, tensors. How I store these tensors is not relevant here, as I reshape them to matrices of the right dimensions before doing the matrix products (with some trickery to avoid ever constructing matrices with an extremely high number of columns).

Matt J on 16 Sep 2020

Never mind. I assume your sparse 3D tensors never actually exist in 3D form anyway, right? Internally, you would have to carry them around as reshaped 2D sparse arrays, because that is the only sparse form that Matlab supports.

AS on 16 Sep 2020

Indeed, I actually store them as a 1D sparse array. I only ever need to reshape these to the appropriate 2D arrays when I need to do some tensor contraction.
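
The thread does not show how this reshaping is implemented. One possible way to go from a 1D sparse array to the (R*N)-by-M matrix used in the contraction, without ever forming a dense or 3D array, is to work on the nonzero subscripts directly. The sketch below assumes column-major linear indexing of an R-by-M-by-N tensor; the sizes and variable names are hypothetical.

```matlab
% Hypothetical sizes and data (not from the thread)
R = 3; M = 4; N = 5;
T = sprand(R*M*N, 1, 0.2);      % 1D sparse storage of an R-by-M-by-N tensor

% Recover the (r, m, n) subscripts of the nonzeros from their linear indices
[lin, ~, vals] = find(T);
[r, m, n] = ind2sub([R, M, N], lin);

% Build the (R*N)-by-M matricization directly, with rows indexed by (r, n)
rowIdx = sub2ind([R, N], r, n);
Areshp = sparse(rowIdx, m, vals, R*N, M);

% Dense reference, only feasible at this toy size
Aref = reshape(permute(reshape(full(T), [R, M, N]), [1 3 2]), [R*N, M]);
sameAsDense = isequal(full(Areshp), Aref);
```

Only the nonzeros are touched, so the cost scales with nnz(T) rather than with R*M*N. For very large tensors the linear indices must stay below flintmax (2^53) to be represented exactly as doubles.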

Answer 2

Bruno Luong on 15 Sep 2020

Edited: Bruno Luong on 15 Sep 2020

0 votes

I guess MATLAB creates a temporary buffer whose length equals the number of rows of A when A*B is invoked. Only TMW employees with access to the source code can give the exact details.
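
To get a feel for what such a buffer would cost, here is a rough back-of-the-envelope calculation. It assumes one 8-byte double per row of A; the thread does not confirm the buffer's actual element type or layout, so treat the number as illustrative only.

```matlab
nRows         = 5e9;   % rows of A in the question
bytesPerEntry = 8;     % assumption: one double per row
bufferGB      = nRows * bytesPerEntry / 1e9;   % buffer size in GB (10^9 bytes)
```

Under that assumption the buffer would need 40 GB, more than the 16 GB of the machine in the question, even though A itself occupies only about 16 kB.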

Here is what I suggest for computing A*B when A is very tall:

[iA, jA, a] = find(A);
m = size(A,1);
n = size(B,2);
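p = numel(jA)*n; % Upper bound on the number of entries in I, J, S
% Preallocate. Use double indices: with 5e9 rows, row indices can exceed
% intmax('uint32') (4294967295) and would saturate in a uint32 array.
I = zeros(p,1);
J = zeros(p,1);
S = zeros(p,1);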
p = 0;
for k=1:n
    [jB, ~, b] = find(B(:,k));
    [i, l] = ismember(jA,jB);
    q = nnz(i);
    idx = p+(1:q);
    I(idx) = iA(i);
    J(idx) = k;
    S(idx) = a(i).*b(l(i));
    p = p+q;
end
idx = 1:p;
AB = sparse(I(idx), J(idx), S(idx), m, n);
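
The final call relies on the fact that sparse(i,j,v,m,n) adds up values that share the same (i,j) subscripts. That is what sums the contributions a(i).*b(...) coming from different columns of A into a single entry of AB. A tiny illustration with made-up triplets:

```matlab
% Three triplets, two of which target entry (1,1)
I = [1; 1; 2];
J = [1; 1; 2];
S = [10; 5; 7];

M = sparse(I, J, S, 2, 2);   % duplicate subscripts are accumulated
Mfull = full(M);             % [15 0; 0 7]
```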

1 comment

Bruno Luong on 15 Sep 2020

Edited: Bruno Luong on 15 Sep 2020

A variant

[iA, jA, a] = find(A);
m = size(A,1);
n = size(B,2);
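p = numel(jA)*n; % Upper bound on the number of entries in I, J, S
% Preallocate. Use double indices: with 5e9 rows, row indices can exceed
% intmax('uint32') (4294967295) and would saturate in a uint32 array.
I = zeros(p,1);
J = zeros(p,1);
S = zeros(p,1);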
p = 0;
for k=1:n
    Bk = B(:,k);
    jB = find(Bk);
    i = ismembc(jA,jB); % undocumented built-in function; unfortunately it doesn't return the second output of ISMEMBER
    q = nnz(i);
    idx = p+(1:q);
    I(idx) = iA(i);
    J(idx) = k;
    S(idx) = a(i).*Bk(jA(i));
    p = p+q;
end
idx = 1:p;
AB = sparse(I(idx), J(idx), S(idx), m, n);

It doesn't seem to be faster than the first method when I test with tic/toc, but my tests are far from covering all cases.

Answer 3

Matt J on 16 Sep 2020

Edited: Matt J on 16 Sep 2020

0 votes

Here's another customized multiplication routine for tall A. I do not know how it compares to Bruno's in terms of speed, but it is loop-free.

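A = sprand(5e9,2, 1e-7); B = speye(2);
tic

m = size(A,1);
n = size(B,2);

Ia = find(any(A,2));   % rows of A that contain a nonzero
Jb = find(any(B,1));   % columns of B that contain a nonzero

C = A(Ia,:)*B(:,Jb);   % small product over the nonzero rows/columns only

[Ic,Jc,S] = find(C);

AB = sparse(Ia(Ic), Jb(Jc), S, m, n);   % equal to A*B

toc   % Elapsed time is 0.001254 seconds.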
