How to accelerate multiple Backslash Operations?

JulesVerne on 20 Jun 2019
Edited: Matt J on 5 Jul 2019
Hi there,
I need to solve 500000 linear systems, i.e. evaluate A\b 500000 times, and I wonder what the fastest way to do this is, since I have to repeat it quite a few times.
The dimensions are A: 15x15x500000 and b: 15x1x500000, so each system is rather small. A is dense.
Is the fastest option really just a for-loop like this one?
n = size(A,3);                  % number of systems (500000)
C = zeros(15,n);                % preallocate the solutions
for i = 1:n
    C(:,i) = A(:,:,i) \ b(:,:,i);
end
What I have tried so far:
Because I have to run the calculations with different b's, computing the inverses (Ainv) of A once is not a big issue. I then used the multiprod package, which vectorizes matrix-vector multiplications, so
C = multiprod(Ainv, b)
is around 6x faster than
for i = 1:size(Ainv,3)
    C(:,i) = Ainv(:,:,i) * b(:,:,i);
end
I think the mmx package works in a similar way.
But it is very often said that the backslash operator should be preferred over computing the inverse, so I'm not very comfortable using that solution.
Maybe multithreading is a solution, but I don't have any experience with that.
Does somebody know whether it is also possible to vectorize the backslash operator, or whether there is another way to speed things up?
I hope I expressed myself properly and didn't forget or overlook something major :)
Any help would be much appreciated.
Julian
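One middle ground between calling backslash from scratch every time and forming explicit inverses is to factor each A(:,:,i) once and reuse the factors for every new b. The sketch below is only an illustration under assumptions: the data are random placeholders, N is kept small, and whether it actually beats the plain loop for 15x15 systems depends on interpreter overhead and needs to be timed on your own machine. It uses lu with the 'vector' option and linsolve with the LT/UT options from base MATLAB.

```matlab
% Sketch: factor each page once, then reuse the factors for every new b.
n = 15; N = 1000;                      % small N, for illustration only
A = rand(n,n,N) + n*eye(n);            % placeholder data, well conditioned
b = rand(n,1,N);                       % placeholder right-hand sides

% One-time cost: LU with partial pivoting, so that Ai(p,:) = L*U.
L = zeros(n,n,N); U = zeros(n,n,N); P = zeros(N,n);
for i = 1:N
    [L(:,:,i), U(:,:,i), P(i,:)] = lu(A(:,:,i), 'vector');
end

% Per right-hand side: two triangular solves instead of a full solve.
optsL.LT = true;                       % matrix is lower triangular
optsU.UT = true;                       % matrix is upper triangular
C = zeros(n,N);
for i = 1:N
    bi = b(P(i,:),1,i);                        % apply the row permutation
    yi = linsolve(L(:,:,i), bi, optsL);        % forward substitution
    C(:,i) = linsolve(U(:,:,i), yi, optsU);    % back substitution
end

% Sanity check against plain backslash on the first system.
x1 = A(:,:,1) \ b(:,:,1);
relErr = norm(C(:,1) - x1) / norm(x1);
fprintf('relative difference to backslash: %.2e\n', relErr);
```

Unlike the explicit inverse, this keeps the pivoted LU factors that backslash would use for a dense square matrix, so the accuracy should be comparable to calling backslash directly.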

Accepted Answer

Matt J on 25 Jun 2019
If you have the Parallel Computing Toolbox, this is readily done on the GPU:
C=pagefun(@mldivide,gpuArray(A), gpuArray(b));
  5 Comments
Bruno Luong on 5 Jul 2019
Each individual GPU operation requires a data transfer between the CPU host and the GPU. If the arrays are small, the transfer overhead will wipe out any advantage you can get from the GPU.
Matt J on 5 Jul 2019
Edited: Matt J on 5 Jul 2019
@Jules, why are you still using a loop? The whole job should be done in a single call to pagefun. On my GPU (GeForce GTX 1080 Ti), the whole calculation takes about 0.1 s.
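gd = gpuDevice;                     % handle to the current GPU
A = gpuArray.rand(15,15,500000);    % create the test data directly on the GPU
b = gpuArray.rand(15,1,500000);
tic;
C = pagefun(@mldivide, A, b);       % solve all 500000 systems in one call
wait(gd);                           % block until the GPU has finished
toc; % Elapsed time is 0.095832 seconds.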
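To see how much the host-device transfer matters for this problem, the timing can include the transfers as well. This is a minimal sketch, assuming the Parallel Computing Toolbox and a supported GPU; the data are random placeholders and N is reduced to keep memory use modest. The data move to the GPU once and come back once, rather than once per system.

```matlab
% Sketch: time the GPU solve including both bulk transfers.
N = 100000;                           % fewer systems than in the question
A = rand(15,15,N);                    % placeholder host data
b = rand(15,1,N);

tic;
Ag = gpuArray(A);                     % one bulk transfer to the device
bg = gpuArray(b);
Cg = pagefun(@mldivide, Ag, bg);      % all pages solved in one call
C  = gather(Cg);                      % one bulk transfer back (blocking)
tTotal = toc;

% Compare the first page against a plain CPU backslash.
x1 = A(:,:,1) \ b(:,:,1);
relErr = norm(C(:,:,1) - x1) / norm(x1);
fprintf('elapsed incl. transfers: %.3f s, rel. difference on page 1: %.2e\n', ...
        tTotal, relErr);
```

If the same A is reused with many different b's, Ag can stay on the device between calls so that only b and the result are transferred each time.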


More Answers (1)

Bruno Luong on 23 Jun 2019
You might try this FEX
  2 Comments
JulesVerne on 24 Jun 2019
Thanks, I'll take a look at it.
Bruno Luong on 26 Jun 2019
In a test on my PC, it cuts the MATLAB for-loop time by a factor of 2 to 3 for 15x15 matrices (SliceMultiSolver: 0.22 s versus 0.66 s for the loop):
size(A) = [15 15 10000]
size(y) = [15 1 10000]
MultipleQRSolve time = 1.21478 [s]
Matlab loop time = 0.656033 [s]
SliceMultiSolver time = 0.218691 [s]
The test code is here:
nA = 15;     % rows of each matrix
mA = 15;     % columns of each matrix
nY = 1;      % right-hand sides per system
nP = 10000;  % number of systems (pages)
szA = [nA,mA,nP];
szY = [nA,nY,nP];
A = randn(szA)+1i*randn(szA);   % random complex test matrices
y = randn(szY)+1i*randn(szY);   % random complex right-hand sides

% Method 1: MultipleQRSolve
tic
% https://fr.mathworks.com/matlabcentral/fileexchange/68976-multipleqr
x1 = MultipleQRSolve(A,y);
t1=toc;

% Method 2: plain MATLAB loop with backslash
tic
x2 = zeros(size(x1));
for k=1:nP
    x2(:,:,k) = A(:,:,k)\y(:,:,k);
end
t2=toc;

% Method 3: SliceMultiSolver
tic
% https://fr.mathworks.com/matlabcentral/fileexchange/24260-multiple-same-size-linear-solver
x3 = SliceMultiSolver(A,y);
t3=toc;

% Report the timings
fprintf('size(A) = %s\n', mat2str(size(A)));
fprintf('size(y) = %s\n', mat2str(size(y)));
fprintf('MultipleQRSolve time = %g [s]\n', t1);
fprintf('Matlab loop time = %g [s]\n', t2);
if exist('t3','var')
    fprintf('SliceMultiSolver time = %g [s]\n', t3);
end
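For readers on newer releases: as far as I know, base MATLAB gained pagemldivide in R2022a, which applies backslash to every page on the CPU without extra toolboxes. It is not available in R2018a, the release this question was asked for, so the sketch below falls back to the plain loop when the function is missing. The data are placeholders shaped like the test above, and the residual check is an addition for illustration, not part of the original benchmark.

```matlab
% Sketch: page-wise backslash on the CPU, with a loop fallback.
nP = 10000;
A = randn(15,15,nP) + 1i*randn(15,15,nP);   % placeholder complex matrices
y = randn(15,1,nP)  + 1i*randn(15,1,nP);    % placeholder right-hand sides

tic
if ~isempty(which('pagemldivide'))          % assumed present from R2022a on
    x4 = pagemldivide(A, y);                % solves A(:,:,k)\y(:,:,k) for all k
else
    x4 = zeros(size(y), 'like', y);         % fallback: plain loop
    for k = 1:nP
        x4(:,:,k) = A(:,:,k) \ y(:,:,k);
    end
end
t4 = toc;

% Largest relative residual over all pages, as a correctness check.
res = 0;
for k = 1:nP
    r = norm(A(:,:,k)*x4(:,:,k) - y(:,:,k)) / norm(y(:,:,k));
    res = max(res, r);
end
fprintf('page-wise solve time = %g [s], max relative residual = %.2e\n', t4, res);
```

The same residual loop can be pointed at x1, x2 or x3 from the test above to confirm that a faster method is also giving comparable answers.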

Version

R2018a
