The same codes, the score is from 5 to 49, too wide.
score.p is invalid now, the score is size, not speed.
The function conv is not slow. It is like complaining that the function sum is slow. And the only way the function sum can be slow is if we are not using it properly.
The only way that this solution can possibly make sense is that the cache is big enough to handle the entire matrix, and breaking into little pieces (using conv) make it slower. However that is only true for data that fit in cache. Convolution is almost a basic arithmetic operation. And as such, the only way to make it truly faster is via hardware.
well, if I recall correctly when using FFT-based convolution the number of operations scales proportional to N*log(N) with the length of the input vectors, while the number of operations in a standard convolution scales proportional to N^2, so for sufficiently large input vectors this code will always be faster than a call to the equivalent conv function, no additional hardware required
Yes, I understand how this can be possible for the current test suite (and I've forgotten about FFT asymptotic speed). However, convolution complexity is not necessarily O(N^2). This only happens when one of the signals has almost the same order of the other. For instance, when signal_1 has the greatest length, and length(signal_2) >= log(length(signal_1)). Otherwise, regular convolution will be faster than O(N log(N)) when it is done in a way to exploit memory availability (avoiding cache misses). Pretty much in the way Gimp does tiling https://docs.gimp.org/2.10/en/gimp-using-setup-tile-cache.html and other software https://software.intel.com/content/www/us/en/develop/articles/efficient-use-of-tiling.html. This made me think that there is probably a way to do convolution faster without FFT.
Btw, any doubter may try what I've said themselves by doing conv(rand(1,5e7),1:9) and the fft method for these two signals. Please, do compare their speed. FFT will be way slower in this case.
PS: 1:9 is the standard size for many filters/kernel used in image processing, while signal processing may use even smaller filters/kernels. It is not an unfair/outlier case.
Find the sum of all the numbers of the input vector
34736 Solvers
275 Solvers
225 Solvers
find the surface area of a cube
285 Solvers
425 Solvers