Summing for loop speed-up by multiplication with unity.
Hi
I was benchmarking for loops against reshaped matrix multiplication in the context of tensor multiplication. Essentially, I wanted to calculate a third-order tensor fo_{k,l,m}, defined as fo_{k,l,m} = sum_{n=1}^{nmax} pt_{k,n}*ft_{n,l,m}, from two given objects pt and ft. As expected, implementing this via reshaping and matrix multiplication is much faster than explicit for loops.
However, I found some odd behaviour in the speed of the explicit for loop implementation. On my machine this code takes 18 s:
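To spell out why reshaping works, here is a short sketch of the index bookkeeping. The composite index j and the matrix names FT and FO are notation introduced here for illustration (they are not in the original post), and the mapping assumes MATLAB's column-major storage with n_l denoting the size of the second dimension (nl in the code):

```latex
\begin{align*}
fo_{k,l,m} &= \sum_{n=1}^{n_{\max}} pt_{k,n}\, ft_{n,l,m}, \\
j &= l + (m-1)\, n_l, \qquad FT_{n,j} = ft_{n,l,m}, \\
FO_{k,j} &= \sum_{n=1}^{n_{\max}} pt_{k,n}\, FT_{n,j}
\quad\Longleftrightarrow\quad FO = pt \cdot FT .
\end{align*}
```

Reshaping FO back to three dimensions then recovers fo_{k,l,m}, so the whole sum over n becomes a single matrix product.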
nk=115;
nl=116;
nm=117;
ft=rand([nk,nl,nm]);
fo = zeros(nk,nl,nm);
pt=rand(nk);
tic
for cntk = 1:nk
    for cntl = 1:nl
        for cntm = 1:nm
            for cntsum = 1:nk
                fo(cntk,cntl,cntm) = fo(cntk,cntl,cntm) + pt(cntk,cntsum)*ft(cntsum,cntl,cntm);
            end
        end
    end
end
toc
Adding a factor of one in the innermost statement of the loop brings the time down to only 4 s:
nk=115;
nl=116;
nm=117;
ft=rand([nk,nl,nm]);
fo = zeros(nk,nl,nm);
pt=rand(nk);
tic
for cntk = 1:nk
    for cntl = 1:nl
        for cntm = 1:nm
            for cntsum = 1:nk
                fo(cntk,cntl,cntm) = 1*fo(cntk,cntl,cntm) + pt(cntk,cntsum)*ft(cntsum,cntl,cntm);
            end
        end
    end
end
toc
Out of curiosity, I also checked a non-nested loop that accumulates a scalar from a vector. Here the effect is reversed. This takes 0.45 s:
a=0;
ntest=100000000;
b=rand(1,ntest);
tic
for cnt = 1 : ntest
a = a + b(ntest);
end
toc
And this takes 0.75 s:
a=0;
ntest=100000000;
b=rand(1,ntest);
tic
for cnt = 1 : ntest
a = 1*a + b(ntest);
end
toc
I wonder what MATLAB (R2015a) is doing differently when executing the two versions. Any ideas?
Kind regards
Zaph
Answers (1)
Jan
on 2 Dec 2017
Edited: Jan
on 2 Dec 2017
My timings under R2015b/64/Win7:
Elapsed time is 16.105212 seconds. % No "1*"
Elapsed time is 16.074966 seconds. % With "1*"
Elapsed time is 0.894157 seconds. % Faster method below
My timings under R2016b/64/Win7:
Elapsed time is 4.882229 seconds. % No "1*"
Elapsed time is 4.767648 seconds. % With "1*"
Elapsed time is 1.163444 seconds. % Faster method below
Obviously the JIT acceleration has a strong effect on the runtime. It seems that in your R2015a the JIT profits from the multiplication by 1, for unknown reasons. The JIT is not documented, so we can only speculate about what is going on.
It is nice that the naive loop runs 3 times faster in R2016b, but what a pity that the faster version gets about 30% slower (0.89 s versus 1.16 s):
for cntk = 1 : nk
ptv = pt(cntk, :);
for cntl = 1:nl
fo(cntk, cntl, :) = ptv * reshape(ft(:, cntl, :), nk, nm);
end
end
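Since the timings evidently depend on the release and on what the JIT does, a single tic/toc run may not tell the whole story. The following is only a sketch, not something from this thread: the function name time_unity_loops, the helpers loopPlain and loopUnity, and the reduced sizes are all assumptions made for illustration. It uses timeit, which calls each function several times and returns the median of the measurements.

```matlab
function t = time_unity_loops()
% Hypothetical helper (not from the thread): times the nested loop with
% and without the factor "1*" using timeit instead of a single tic/toc.
nk = 30; nl = 31; nm = 32;      % assumed small sizes to keep the run short
pt = rand(nk);
ft = rand(nk, nl, nm);
t.plain = timeit(@() loopPlain(pt, ft));
t.unity = timeit(@() loopUnity(pt, ft));
fprintf('plain: %.4f s, with 1*: %.4f s\n', t.plain, t.unity);
end

function fo = loopPlain(pt, ft)
% Naive accumulation without the factor of one.
[nk, nl, nm] = size(ft);
fo = zeros(nk, nl, nm);
for k = 1:nk
    for l = 1:nl
        for m = 1:nm
            for n = 1:nk
                fo(k,l,m) = fo(k,l,m) + pt(k,n)*ft(n,l,m);
            end
        end
    end
end
end

function fo = loopUnity(pt, ft)
% Same accumulation, with the explicit "1*" from the question.
[nk, nl, nm] = size(ft);
fo = zeros(nk, nl, nm);
for k = 1:nk
    for l = 1:nl
        for m = 1:nm
            for n = 1:nk
                fo(k,l,m) = 1*fo(k,l,m) + pt(k,n)*ft(n,l,m);
            end
        end
    end
end
end
```

Saved as time_unity_loops.m, it can be called as t = time_unity_loops(); whether the two numbers differ will depend on the MATLAB release.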
2 comments
Jan
on 2 Dec 2017
fo=reshape(pt*reshape(ft,[nk,nm*nl]),[nk,nl,nm]);
This takes 0.1 sec on my machine.
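Before comparing speeds, it may be worth confirming that the three variants in this thread actually produce the same tensor. The sketch below does that on deliberately small sizes; the function name check_tensor_product, the helper names, the sizes and the tolerance are assumptions chosen for illustration. Because the summation order differs between the methods, the results are compared with a tolerance rather than for exact equality.

```matlab
function maxErr = check_tensor_product()
% Hypothetical check (not from the thread): compares the naive loop, the
% row-wise version and the fully reshaped product on a small example.
nk = 7; nl = 5; nm = 6;         % assumed small sizes so the loop is quick
pt = rand(nk);
ft = rand(nk, nl, nm);

foLoop = naiveLoop(pt, ft);
foRow  = rowWise(pt, ft);
foMat  = reshape(pt * reshape(ft, nk, nl*nm), nk, nl, nm);

% Largest absolute deviation of either fast version from the naive loop.
maxErr = max([max(abs(foLoop(:) - foRow(:))), max(abs(foLoop(:) - foMat(:)))]);
assert(maxErr < 1e-12, 'Implementations disagree.');
end

function fo = naiveLoop(pt, ft)
% fo(k,l,m) = sum_n pt(k,n)*ft(n,l,m), written as four nested loops.
[nk, nl, nm] = size(ft);
fo = zeros(nk, nl, nm);
for k = 1:nk
    for l = 1:nl
        for m = 1:nm
            for n = 1:nk
                fo(k,l,m) = fo(k,l,m) + pt(k,n)*ft(n,l,m);
            end
        end
    end
end
end

function fo = rowWise(pt, ft)
% One row of pt times one nk-by-nm slice of ft per inner iteration.
[nk, nl, nm] = size(ft);
fo = zeros(nk, nl, nm);
for k = 1:nk
    ptv = pt(k, :);
    for l = 1:nl
        fo(k, l, :) = ptv * reshape(ft(:, l, :), nk, nm);
    end
end
end
```

Saved as check_tensor_product.m, calling maxErr = check_tensor_product() returns the largest deviation and raises an error if the versions disagree beyond the assumed tolerance.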