Summing for loop speed-up by multiplication with unity.

Zaphod on 1 Dec 2017
Commented: Jan on 2 Dec 2017
Hi
I was benchmarking for loops against reshaping plus matrix multiplication in the context of tensor multiplication. Essentially, I want to calculate a third-order tensor fo_{k,l,m}, defined as fo_{k,l,m} = sum_{n=1}^{nmax} pt_{k,n}*ft_{n,l,m}, from two given objects pt and ft. As expected, I found that implementing this via reshaping and matrix multiplication is much faster than explicit for loops.
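To spell out why the reshape route is possible at all: the sum over n is an ordinary matrix product once the two trailing indices of ft are merged into one column index. A sketch, assuming column-major (MATLAB-style) unfolding; the symbols FT_{(1)} and FO_{(1)} for the unfolded arrays and n_l for the size of the l dimension are notation introduced here, not taken from the post:

```latex
\[
fo_{k,l,m} = \sum_{n=1}^{n_{\max}} pt_{k,n}\, ft_{n,l,m}
\quad\Longleftrightarrow\quad
FO_{(1)} = PT \, FT_{(1)},
\]
\[
\bigl[FT_{(1)}\bigr]_{n,\; l + (m-1)\,n_l} = ft_{n,l,m},
\qquad
\bigl[FO_{(1)}\bigr]_{k,\; l + (m-1)\,n_l} = fo_{k,l,m}.
\]
```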
However, I found some odd behaviour in the speed of the explicit for-loop implementation. On my machine this code takes 18 s:
nk = 115;
nl = 116;
nm = 117;
ft = rand([nk, nl, nm]);
fo = zeros(nk, nl, nm);
pt = rand(nk);
tic
for cntk = 1:nk
    for cntl = 1:nl
        for cntm = 1:nm
            for cntsum = 1:nk   % summation index n
                fo(cntk,cntl,cntm) = fo(cntk,cntl,cntm) + pt(cntk,cntsum)*ft(cntsum,cntl,cntm);
            end
        end
    end
end
toc
Adding a factor of one in the innermost statement of the loop brings the time down to only 4 s:
nk = 115;
nl = 116;
nm = 117;
ft = rand([nk, nl, nm]);
fo = zeros(nk, nl, nm);
pt = rand(nk);
tic
for cntk = 1:nk
    for cntl = 1:nl
        for cntm = 1:nm
            for cntsum = 1:nk   % summation index n
                % only change: the accumulator is multiplied by 1
                fo(cntk,cntl,cntm) = 1*fo(cntk,cntl,cntm) + pt(cntk,cntsum)*ft(cntsum,cntl,cntm);
            end
        end
    end
end
toc
Out of curiosity, I also checked a single, non-nested loop that accumulates vector elements into a scalar. Here the effect is reversed. This takes 0.45 s:
a = 0;
ntest = 100000000;
b = rand(1, ntest);
tic
for cnt = 1:ntest
    a = a + b(ntest);   % reads the last element b(ntest) on every pass, not b(cnt)
end
toc
And this takes 0.75 s:
a = 0;
ntest = 100000000;
b = rand(1, ntest);
tic
for cnt = 1:ntest
    a = 1*a + b(ntest);   % same loop, accumulator multiplied by 1
end
toc
I wonder what MATLAB (R2015a) is doing differently when executing the two versions. Any ideas?
Kind regards
Zaph

Answers (1)

Jan on 2 Dec 2017
Edited: Jan on 2 Dec 2017
My timings under R2015b/64/Win7:
Elapsed time is 16.105212 seconds. % No "1*"
Elapsed time is 16.074966 seconds. % With "1*"
Elapsed time is 0.894157 seconds. % Faster method below
My timings under R2016b/64/Win7:
Elapsed time is 4.882229 seconds. % No "1*"
Elapsed time is 4.767648 seconds. % With "1*"
Elapsed time is 1.163444 seconds. % Faster method below
Obviously, JIT acceleration has a strong effect on the runtime. It seems that in your R2015a the JIT profits from the multiplication by 1, for unknown reasons. The JIT is not documented, so we can only speculate about what is going on.
It is nice that the naive loop runs about 3 times faster in R2016b, but what a pity that the faster version gets about 30% slower (0.89 s versus 1.16 s):
for cntk = 1:nk
    ptv = pt(cntk, :);
    for cntl = 1:nl
        fo(cntk, cntl, :) = ptv * reshape(ft(:, cntl, :), nk, nm);
    end
end
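Since the effect depends so much on the release, it may help to measure both loop variants with timeit, which calls the code several times and returns a typical time, rather than with a single tic/toc. The following is only a sketch: the function names loopPlain and loopUnity are made up for this example, the sizes are reduced from the 115/116/117 used above to keep the run short, and pt is assumed to be square as in the question. Local functions at the end of a script need R2016b or later; on older releases each function would have to go into its own file. Timings inside functions need not match the script-based numbers above.

```matlab
% Sketch: time the two loop variants with timeit (helper names are hypothetical).
nk = 30; nl = 31; nm = 32;            % reduced sizes, assumption for a short run
pt = rand(nk);
ft = rand(nk, nl, nm);

tPlain = timeit(@() loopPlain(pt, ft));   % fo = fo + ...
tUnity = timeit(@() loopUnity(pt, ft));   % fo = 1*fo + ...
fprintf('no "1*":   %.4f s\nwith "1*": %.4f s\n', tPlain, tUnity);

function fo = loopPlain(pt, ft)
    % Four nested loops, accumulator used as is.
    [nk, nl, nm] = size(ft);
    fo = zeros(nk, nl, nm);
    for k = 1:nk
        for l = 1:nl
            for m = 1:nm
                for n = 1:nk
                    fo(k,l,m) = fo(k,l,m) + pt(k,n)*ft(n,l,m);
                end
            end
        end
    end
end

function fo = loopUnity(pt, ft)
    % Same loops, accumulator multiplied by 1 in the innermost statement.
    [nk, nl, nm] = size(ft);
    fo = zeros(nk, nl, nm);
    for k = 1:nk
        for l = 1:nl
            for m = 1:nm
                for n = 1:nk
                    fo(k,l,m) = 1*fo(k,l,m) + pt(k,n)*ft(n,l,m);
                end
            end
        end
    end
end
```

Both functions return identical results, so any difference in the two printed times comes from how the release executes the loop body, not from the arithmetic.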
2 Comments
Zaphod on 2 Dec 2017
Edited: Zaphod on 2 Dec 2017
Hi Jan,
Thanks for picking up on this one. In terms of best speed I'm currently working with this line:
fo=reshape(pt*reshape(ft,[nk,nm*nl]),[nk,nl,nm]);
It does the trick in a few milliseconds (R2015a, MacBook).
Your results with different MATLAB versions are very interesting and make me even more curious.
Thanks and have a great day,
Zaph
Jan on 2 Dec 2017
fo=reshape(pt*reshape(ft,[nk,nm*nl]),[nk,nl,nm]);
This takes 0.1 sec on my machine.
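Before comparing speeds it is worth confirming that the three implementations from this thread give the same result. A minimal sketch on small sizes, chosen here only for illustration; the variable names foLoop, foRow and foMat are made up for this check. The dimensions are deliberately all different so that a mixed-up index order would show up as a size error or a large deviation.

```matlab
% Sketch: compare the nested loops, the row-wise version and the reshape one-liner.
nk = 5; nl = 6; nm = 7;               % small, distinct sizes (assumption)
pt = rand(nk);
ft = rand(nk, nl, nm);

% 1) Reference: four nested loops, straight from the definition.
foLoop = zeros(nk, nl, nm);
for k = 1:nk
    for l = 1:nl
        for m = 1:nm
            for n = 1:nk
                foLoop(k,l,m) = foLoop(k,l,m) + pt(k,n)*ft(n,l,m);
            end
        end
    end
end

% 2) Row-wise version: one vector-matrix product per (k, l) pair.
foRow = zeros(nk, nl, nm);
for k = 1:nk
    ptv = pt(k, :);
    for l = 1:nl
        foRow(k, l, :) = ptv * reshape(ft(:, l, :), nk, nm);
    end
end

% 3) One-liner: merge the (l, m) indices into columns, multiply, split again.
foMat = reshape(pt * reshape(ft, [nk, nl*nm]), [nk, nl, nm]);

% Largest absolute deviations from the reference; expected to be at rounding level.
errRow = max(abs(foRow(:) - foLoop(:)));
errMat = max(abs(foMat(:) - foLoop(:)));
fprintf('row-wise vs loops: %.2e\nreshape vs loops:  %.2e\n', errRow, errMat);
```

The deviations are not expected to be exactly zero, because the matrix product may sum the terms in a different order than the explicit loop.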
