MATLAB Answers

Memory cost of multiplying sparse matrices

10 views (last 30 days)
AS
AS on 14 Sep 2020
Edited: Bruno Luong on 18 Sep 2020
What is the memory cost for multiplying sparse matrices? It seems to be much larger than the memory used by either of the matrices being multiplied:
>> A = sprand(5e9,2, 1e-7); B = sparse(eye(2));
whos
Name Size Bytes Class Attributes
A 5000000000x2 16024 double sparse
B 2x2 56 double sparse
>> A*B;
Error using *
Out of memory. Type HELP MEMORY for your options.
As you can see in the example above, the sparse matrices A and B are not taking up much memory, but computing A*B still results in an out of memory error. Why does this happen, and is there a way to avoid it?

  8 Comments

Show 5 older comments
Matt J
Matt J on 18 Sep 2020
But even on your 32 GB machine, does your Task Manager show a spike in RAM usage when the operation is performed? Mine doesn't.
AS
AS on 18 Sep 2020
I'm using R2018a on a 16GB machine. I don't seem to see a spike in memory usage when trying a a slightly smaller size than the one causing an eror, but the computation is so fast that I don't think htop or a task manager would pick it up.
Bruno Luong
Bruno Luong on 18 Sep 2020
Agress that task manager could miss it. I don't see any spike on my 32 Gb PC while
AB = A*B
is being carried out sous MATLAB

Sign in to comment.

Accepted Answer

Matt J
Matt J on 15 Sep 2020
Edited: Matt J on 15 Sep 2020
I believe it is simply because Matlab sparse matrix routines don't handle very tall & thin matrix dimensions very well. It becomes much faster and less memory-consuming if you reshape A to have a few orders of magnitude fewer rows, and do the following equivalent computation:
A = sprand(5e9,2, 1e-7); B = speye(2);
%%Begin workaround
Ar=reshape(A,[],1000);
Br=kron(B, speye(500));
result= reshape(Ar*Br,size(A));

  9 Comments

Show 6 older comments
AS
AS on 15 Sep 2020
About avoiding reshaping the sparse arrays, woudln't a bunch of forloops to do contractions element-wise be much much slower than the necessary permutes and reshapes?
As to what concerns the number of columns: I managed to avoid a high memory cost when many columns are required in the reshape by working with the transpose in such cases: using AT'*B where would be the transpose of Areshp in the example above. Things are done in a way that avoids ever actually making a sparse array with more than 1e9 colums.
Matt J
Matt J on 16 Sep 2020
Never mind. I assume your sparse 3D tensors never actually exist in 3D form anyway, right? Internally, you would have to carry them around as reshaped 2D sparse arrays, because that is the only sparse form that Matlab supports.
AS
AS on 16 Sep 2020
Indeed, I actually store them as a 1D sparse array. I only ever need to reshape these to the appropriate 2D arrays when I need to do some tensor contraction.

Sign in to comment.

More Answers (2)

Bruno Luong
Bruno Luong on 15 Sep 2020
Edited: Bruno Luong on 15 Sep 2020
I guess MATLAB creates a temporary buffer of length equals to the number of rows of A when A*B is invoked. The exact detail only TMW employees who can acces to the source code can answer.
Here is what I suggest to multiply A*B for very long tall A
[iA, jA, a] = find(A);
m = size(A,1);
n = size(B,2);
p = numel(jA)*n; % Guess of size of I, J, S
% Preallocate
I = zeros(p,1,'uint32');
J = zeros(p,1,'uint32');
S = zeros(p,1);
p = 0;
for k=1:n
[jB, ~, b] = find(B(:,k));
[i, l] = ismember(jA,jB);
q = nnz(i);
idx = p+(1:q);
I(idx) = iA(i);
J(idx) = k;
S(idx) = a(i).*b(l(i));
p = p+q;
end
idx = 1:p;
AB = sparse(I(idx), J(idx), S(idx), m, n);

  1 Comment

Bruno Luong
Bruno Luong on 15 Sep 2020
A variant
[iA, jA, a] = find(A);
m = size(A,1);
n = size(B,2);
p = numel(jA)*n; % Guess of size of I, J, S
% Preallocate
I = zeros(p,1,'uint32');
J = zeros(p,1,'uint32');
S = zeros(p,1);
p = 0;
for k=1:n
Bk = B(:,k);
jB = find(Bk);
i = ismembc(jA,jB); % undocumented stock function, too bad it's doesn't return second argument of ISMEMBER
q = nnz(i);
idx = p+(1:q);
I(idx) = iA(i);
J(idx) = k;
S(idx) = a(i).*Bk(jA(i));
p = p+q;
end
idx = 1:p;
AB = sparse(I(idx), J(idx), S(idx), m, n);
It doesn't seem to be faster than the first method when I test with tic/toc, but the tests I conducted are far from cover all the cases.

Sign in to comment.


Matt J
Matt J on 16 Sep 2020
Edited: Matt J on 16 Sep 2020
Here's another customized multiplication routine for tall A. I do not know how it compares to Bruno's in terms of speed, but it is loop-free.
A = sprand(5e9,2, 1e-7); B = speye(2);
tic
m=size(A,1);
n=size(B,2);
Ia=find(any(A,2));
Jb=find(any(B,1));
C=A(Ia,:)*B(:,Jb);
[Ic,Jc,S]=find(C);
AB=sparse( Ia(Ic) , Jb(Jc) , S , m,n); %equal to A*B
toc%Elapsed time is 0.001254 seconds.

  0 Comments

Sign in to comment.

Products


Release

R2018a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by