[Fixed-Point Toolbox] How to speed up fi object matrix multiplication?

Starscream123 on 29 Aug 2018
In my current project I need to multiply matrices with fi object entries quite a lot. It turns out that this type of multiplication is significantly slower in MATLAB than matrix multiplication with double entries. As an example, consider the following code snippet:
tic;
N = 200;
A = fi(rand(N,N),1,32,16);   % signed, 32-bit word length, 16 fraction bits
B = fi(rand(N,N),1,32,16);
C = A*B;
t = toc;
This takes about 13 seconds to multiply two 200x200 matrices. Profiling the code shows that almost all of the time is spent in line 25 of the mtimes.m file of the Fixed-Point Toolbox. That line is:
c = fimtimes(a,b);
The debugger doesn't allow me to step into the fimtimes routine. Is it possible to accelerate these computations by using MEX files or some other method? Or is my only way forward to write my own conversion function that emulates fixed-point arithmetic with MATLAB's native floating-point support? Thanks. This was checked on R2017b and R2016a.
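To be concrete, the emulation I have in mind is something like the following untested sketch, which stores each signed 32-bit, 16-fraction-bit value as an integer count of 2^-16 held in a double:
% Untested sketch: emulate S32.16 fixed point with scaled doubles.
N  = 200;
Ai = round(rand(N)*2^16);        % integer representation of A (units of 2^-16)
Bi = round(rand(N)*2^16);        % integer representation of B (units of 2^-16)
Ci = round((Ai*Bi)/2^16);        % product, rescaled back to units of 2^-16
C  = Ci/2^16;                    % real-world values
% Caveat: exact products of two general 32-bit fixed-point values need up to
% 64 bits, but a double mantissa holds only 53, so low-order bits can be lost.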

Answers (2)

John D'Errico on 30 Aug 2018
Edited: John D'Errico on 30 Aug 2018
Why would you be even remotely surprised that this code runs more slowly than the highly optimized code that works on doubles? A matrix multiply in double or single precision can use the BLAS to do the work: highly optimized routines that can use multiple threads on their own as needed. The necessary multiplies and adds are done in a low-level call that flies like blazes.
But what happens when you try to do this with fi objects? The fi matrix multiply is instead surely written using simple loops, working on a data type that is not as trivially multiplied or added in hardware. So it is slow as molasses, at least relatively so. Remember that a 200x200 matrix*matrix multiply involves 200^3 = 8,000,000 scalar multiplies, and a similar number of additions.
8e6/13
ans =
615384.615384615
So roughly 0.6 million fixed-point operations per second. That seems reasonably fast to me.
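If you want to see the gap for yourself, here is a rough comparison you can run (timings will vary by machine):
% Rough comparison of double vs. fi matrix multiply (illustrative only).
N  = 200;
Ad = rand(N);  Bd = rand(N);
tic; Cd = Ad*Bd;  tDouble = toc              % BLAS-backed, typically milliseconds
Af = fi(Ad,1,32,16);  Bf = fi(Bd,1,32,16);
tic; Cf = Af*Bf;  tFi = toc                  % the ~13 second case from the question
slowdown = tFi/tDouble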
So, can you write this yourself in MATLAB, or perhaps by writing C code?
Yeah, right. You may be dreaming. Don't forget that to work in fixed-point arithmetic, you need to do a multiply, but then make sure that you CAREFULLY and CORRECTLY round the result.
Be VERY careful here, because floating point numbers CANNOT represent a number like 2.37 exactly as a double. So IF you try to do this task, you will find that your computations will be inconsistent compared to what the fi multiplies and adds return.
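A quick way to see the representability issue (a minimal illustration, assuming the 32-bit, 16-fraction-bit type from the question):
x = 2.37;
f = fi(x, 1, 32, 16);
fprintf('%.20f\n', x);           % the stored double is not exactly 2.37
fprintf('%.20f\n', double(f));   % the fi value is the nearest multiple of 2^-16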
So, can you do it? Well, a lot of that depends on how good your coding skills are. Are they good enough that you can write code that outperforms code written by someone who does know what they are doing? Is your understanding of computer arithmetic good enough that you can do this yourself, writing more highly optimized code than what already exists? My guess is that if you needed to ask this question, then it probably is not.
Let's see: if a 200x200 matrix*matrix multiply requires 8e6 multiplies and roughly 8e6 adds, and currently takes 13 seconds to do the work, then to beat it you need to do each multiply and addition pair in less than about 1.6 microseconds.
13/8e6
ans =
1.625e-06
Don't forget about function call overhead, passing data around, etc. So unless your hand-written code can accurately and efficiently beat that speed, you would be wasting your time.
As for direct compilation of the MATLAB code into MEX, don't expect that this will be faster. That rarely offers a gain in speed, and may well be slower.
Could you write code that uses parallel computing to do the work? Well, IF you have a Parallel Computing Toolbox license, and you have sufficient skills to do the work, and you have sufficient CPUs available, perhaps. I would first check whether the Fixed-Point Toolbox matrix multiply is already multi-threaded when it does the call. Sometimes these things already use multiple threads automatically.
It is trivially easy to formulate a problem that you want to run quickly. We get spoiled by computers that are so fast and powerful.

Andy Bartlett on 1 Sep 2020
Making simulation two orders of magnitude faster with fiaccel
If the fixed-point code you are simulating is code generation compatible, then you can speed up the simulation using fiaccel.
For example, the two attached files are a modified version of the code in Starscream123's original question, and a test harness.
Running the test harness outputs the following timing results on my computer.
>> test_fiMatMul
time_Elapsed_Interpreted_Mode_Sim =
18.2188
time_Elapsed_fiaccel_Mode_Sim =
0.1301
timeReductionPercent =
99.2858
speedUpRatio =
140.0134
Notice that simulation using fiaccel was 140X faster than using interpreted mode.
Specific elapsed times and speed up ratios will depend on the algorithm and the computer(s) used.
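If you do not have the attachments handy, the fiaccel workflow looks roughly like this (the file and variable names here are just an illustration, not the actual attached files). First put the multiply in a code generation compatible function file, for example fiMatMul.m:
function C = fiMatMul(A, B) %#codegen
% Code generation compatible fixed-point matrix multiply.
C = A*B;
end
Then generate a MEX version for inputs of this type and size, and time both paths (fiaccel names the generated function fiMatMul_mex by default):
N = 200;
A = fi(rand(N,N),1,32,16);
B = fi(rand(N,N),1,32,16);
fiaccel fiMatMul -args {A,B}                        % requires Fixed-Point Designer
tic; C1 = fiMatMul(A, B);     tInterpreted = toc;   % interpreted MATLAB
tic; C2 = fiMatMul_mex(A, B); tAccelerated = toc;   % generated MEX
speedUpRatio = tInterpreted/tAccelerated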
