How to accelerate multiple Backslash Operations?
Hi there,
I need to solve 500000 linear systems, i.e. evaluate A\b 500000 times, and since I have to repeat this quite a few times I wonder what the fastest way to do it is.
The dimensions are A: 15*15*500000 and b: 15*1*500000, so each system is rather small. A is dense.
Is the fastest option really just a for-loop like this one?
for i = 1:size(A,3)   % one solve per page (500000 here)
    C(:,i) = A(:,:,i) \ b(:,i);
end
What I have tried so far:
Because I have to run the calculations with different b's, computing the inverses (Ainv) of A once is not a big cost. I used the multiprod package, which vectorizes matrix-vector multiplications, so
C = multiprod(Ainv, b)
is around 6x faster than
for i = 1:size(Ainv,3)
    C(:,i) = Ainv(:,:,i) * b(:,i);
end
I think the mmx package works in a similar way.
But it is very often said that the backslash operator should be preferred over computing the inverse, so I'm not very comfortable using that solution.
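One possible middle ground between explicit inverses and repeated backslash, sketched here under the assumption that the same A is reused with several right-hand sides: factor each page once with lu and reuse the factors. The variable names (Lf, Uf, P, nP) and the reduced page count are illustrative only, and whether this is actually faster than the alternatives for 15x15 pages would have to be measured.

```matlab
% Sketch: factor each 15x15 page once, then reuse the factors for new b's.
n  = 15;
nP = 1000;                      % reduced page count for illustration
A  = rand(n, n, nP);
b  = rand(n, 1, nP);

% One-time cost: LU with partial pivoting, A(p,:) = L*U for each page.
Lf = zeros(n, n, nP);
Uf = zeros(n, n, nP);
P  = zeros(nP, n);              % row i holds the pivot vector of page i
for i = 1:nP
    [Lf(:,:,i), Uf(:,:,i), P(i,:)] = lu(A(:,:,i), 'vector');
end

% Per right-hand side: two triangular solves per page.
C = zeros(n, nP);
for i = 1:nP
    bi = b(:, 1, i);
    C(:, i) = Uf(:,:,i) \ (Lf(:,:,i) \ bi(P(i,:)));
end

% Compare against a direct backslash on the first page.
relErr = norm(C(:,1) - A(:,:,1) \ b(:,1,1)) / norm(C(:,1));
```

Only the second loop has to be repeated when b changes; the factors stay valid as long as A does.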
Maybe multithreading is a solution, but I don't have any experience with that.
Does somebody know whether it is also possible to vectorize the backslash operator, or whether there is another way to speed things up?
I hope I expressed myself properly and didn't forget or overlook something major :)
Any help would be much appreciated.
Julian
Accepted Answer
Matt J
on 25 Jun 2019
If you have the Parallel Computing Toolbox, this is readily done on the GPU:
C=pagefun(@mldivide,gpuArray(A), gpuArray(b));
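The one-liner above leaves the result on the GPU. A minimal sketch of the full round trip, assuming a release that provides canUseGPU (R2019b or later) and using a reduced page count; the CPU fallback and the residual check are additions for illustration, not part of the original answer.

```matlab
% Sketch: batched solve on the GPU, with the result copied back to the host.
n  = 15;
nP = 1000;                          % reduced page count for illustration
A  = rand(n, n, nP);
b  = rand(n, 1, nP);

if canUseGPU()                      % assumption: R2019b+ and a supported GPU
    Cg = pagefun(@mldivide, gpuArray(A), gpuArray(b));  % n x 1 x nP, on the GPU
    C  = reshape(gather(Cg), n, nP);                    % n x nP, on the CPU
else
    % Fallback so the sketch still runs without a GPU.
    C = zeros(n, nP);
    for i = 1:nP
        C(:, i) = A(:,:,i) \ b(:,1,i);
    end
end

% Largest relative residual over all pages.
maxRes = 0;
for i = 1:nP
    r = norm(A(:,:,i) * C(:,i) - b(:,1,i)) / norm(b(:,1,i));
    maxRes = max(maxRes, r);
end
```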
5 Comments
Bruno Luong
on 5 Jul 2019
Each individual GPU operation requires a data transfer between the CPU host and the GPU. If the data are small, the transfer overhead will kill any advantage you can get from the GPU.
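To see how much of the total time goes into transfers, the upload, the solve, and the download can be timed separately. A rough sketch, assuming canUseGPU is available (R2019b or later); the struct t and its field names are made up for illustration, and single tic/toc measurements like these are noisy.

```matlab
% Sketch: time host-to-GPU transfer, batched solve, and GPU-to-host transfer.
n  = 15;
nP = 1000;                          % reduced page count for illustration
A  = rand(n, n, nP);
b  = rand(n, 1, nP);

t = struct('upload', NaN, 'solve', NaN, 'download', NaN);
if canUseGPU()                      % assumption: R2019b+ and a supported GPU
    gd = gpuDevice;

    tic; Ag = gpuArray(A); bg = gpuArray(b); wait(gd);
    t.upload = toc;

    tic; Cg = pagefun(@mldivide, Ag, bg); wait(gd);
    t.solve = toc;

    tic; C = gather(Cg);
    t.download = toc;

    fprintf('upload = %g s, solve = %g s, download = %g s\n', ...
            t.upload, t.solve, t.download);
else
    fprintf('No supported GPU found; nothing timed.\n');
end
```

If the same A is reused with many b's, only the upload of b and the download of C recur per solve.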
Matt J
on 5 Jul 2019
Edited: Matt J
on 5 Jul 2019
@Jules, Why are you still using a loop? The whole job should have been done in just the single call to the pagefun command. On my GPU (GeForce GTX 1080 Ti), the whole calculation takes 0.1 sec.
gd=gpuDevice;
A=gpuArray.rand(15,15,500000);
b = gpuArray.rand(15,1,500000);
tic;
C=pagefun(@mldivide, A, b);
wait(gd);
toc; % Elapsed time is 0.095832 seconds.
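Newer MATLAB releases also offer a CPU counterpart to the single pagefun call. A minimal sketch, assuming R2022a or later, where pagemldivide is available; the page count is reduced and the fallback loop is included only so the snippet also runs on older releases.

```matlab
% Sketch: page-wise backslash on the CPU, no toolbox or GPU required.
n  = 15;
nP = 1000;                          % reduced page count for illustration
A  = rand(n, n, nP);
b  = rand(n, 1, nP);

if ~isempty(which('pagemldivide'))  % assumption: R2022a or later
    C = pagemldivide(A, b);         % n x 1 x nP, one solve per page
else
    % Fallback for older releases: plain loop over the pages.
    C = zeros(n, 1, nP);
    for i = 1:nP
        C(:,:,i) = A(:,:,i) \ b(:,:,i);
    end
end

% Relative residual of the first page as a quick sanity check.
res1 = norm(A(:,:,1) * C(:,:,1) - b(:,:,1)) / norm(b(:,:,1));
```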
More Answers (1)
Bruno Luong
on 23 Jun 2019
You might try this FEX
2 Comments
Bruno Luong
on 26 Jun 2019
In a test on my PC, it reduces the MATLAB for-loop time by a factor of 2 to 3 for 15x15 matrices:
size(A) = [15 15 10000]
size(y) = [15 1 10000]
MultipleQRSolve time = 1.21478 [s]
Matlab loop time = 0.656033 [s]
SliceMultiSolver time = 0.218691 [s]
The test code is here:
nA = 15;
mA = 15;
nY = 1;
nP = 10000;
szA = [nA,mA,nP];
szY = [nA,nY,nP];
A = randn(szA)+1i*randn(szA);
y = randn(szY)+1i*randn(szY);
tic
% https://fr.mathworks.com/matlabcentral/fileexchange/68976-multipleqr
x1 = MultipleQRSolve(A,y);
t1=toc;
tic
x2 = zeros(size(x1));
for k=1:nP
x2(:,:,k) = A(:,:,k)\y(:,:,k);
end
t2=toc;
tic
% https://fr.mathworks.com/matlabcentral/fileexchange/24260-multiple-same-size-linear-solver
x3 = SliceMultiSolver(A,y);
t3=toc;
fprintf('size(A) = %s\n', mat2str(size(A)));
fprintf('size(y) = %s\n', mat2str(size(y)));
fprintf('MultipleQRSolve time = %g [s]\n', t1);
fprintf('Matlab loop time = %g [s]\n', t2);
if exist('t3','var')
fprintf('SliceMultiSolver time = %g [s]\n', t3);
end