Code is running very slowly; how can I make it faster?

Hello,
I am having trouble getting this code to run faster. I am using it for my final-year dissertation, and I would really appreciate your help in optimizing it.
The code computes the Zero Normalized Cross Correlation (ZNCC), which I want to use for template matching. I have attached all the files, including a screenshot of my profiling of the code, which is very slow. I have also attached the workspace needed. Everything is in the link above.
Other things you need to know for running the code are in the "Recommended Instruction for Executing Code.txt" file.
I will really appreciate it if you can help me.
Thanks a lot

4 comments

Mario Malic on 24 Jul 2020
Recommended Instruction for Executing Code.txt doesn't exist.
Fego Etese on 24 Jul 2020
I truly apologize for this; I didn't know it wasn't in the folder I uploaded. I have made the changes now.
Thank you, I appreciate it.
Mario Malic on 25 Jul 2020
Edited: Mario Malic on 29 Jul 2020
Is double necessary?
% convert to single
target = single(matchImg);
template = single(enrollTemplateImage);
Changing them both to single gives some improvement.
>> tic, posZNCC = znccPrf(enrollTemplateImage); toc % with double, .png image
Elapsed time is 1.675404 seconds.
>> tic, posZNCC = znccPrf(enrollTemplateImage); toc % with single, .png image
Elapsed time is 1.418316 seconds.
Also, you run your code in a OneDrive folder; maybe the .mlx generates some files while running and the OneDrive sync causes problems.
Also, consider how the output of matchImg differs when you supply different images. Finger1A will not generate the same output as Finger1E, even though they are in the same format.
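A minimal sketch of how the single-versus-double comparison could be made more repeatable than a lone tic/toc, using timeit. The image here is synthetic random data and the kernel is a stand-in (a sum of squared deviations), not the actual znccPrf function; both are assumptions for illustration only.

```matlab
% Sketch: time the same stand-in kernel on double and single data.
rng(0);                                   % reproducible synthetic data
imgD = rand(374, 388);                    % stand-in for the target image (double)
imgS = single(imgD);                      % same data in single precision

% Stand-in workload (NOT the real znccPrf): sum of squared deviations
kernel = @(a) sum((a(:) - sum(a(:))/numel(a)).^2);

tD = timeit(@() kernel(imgD));            % timeit runs the call several times
tS = timeit(@() kernel(imgS));
fprintf('double: %.3g s, single: %.3g s\n', tD, tS);

% Rounding error introduced by single precision, relative to double
valD = kernel(imgD);
valS = kernel(imgS);
relErr = abs(double(valS) - valD) / valD;
fprintf('relative error from single: %.2g\n', relErr);
```

Whether single is actually faster depends on the machine and on the operations involved, so the printed times are worth checking rather than assuming.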
Fego Etese on 25 Jul 2020
Thank you so much, Mario Malic, for your help.
I'll test out single and see if it gives any improvement. Sorry for my late reply, I've been away from my computer for a while now.
I don't normally run the code from my OneDrive; I just uploaded it there so I could share it here. I will also take up your suggestion and convert the images to png. Thanks so much


Answers (1)

per isakson on 25 Jul 2020
Edited: per isakson on 7 Aug 2020
Caveat: I've never seriously used the Live Editor.
I've undertaken the following steps:
  1. uploaded your files to a new folder, which I made the current folder
  2. read "Recommended Instruction for Executing Code.txt" file.
  3. loaded workspace.mat
  4. converted znccPrf.mlx to znccPrf.m (an old time m-file)
  5. changed imread('finger1E.tif'); to imread('finger1E.png'); since there was no tif-file in the upload.
  6. profiled posZNCC = znccPrf(enrollTemplateImage);. The statement, meanRef=mean(mean(ref)); dominated together with "self time".
  7. replaced mean(mean(ref)); by mean(ref,'all');. That helped a bit. And sum(reshape(ref,1,[]))/numel(ref); is still a bit faster.
Finally, I ran
>> tic, posZNCC = znccPrf(enrollTemplateImage); toc
Elapsed time is 2.003947 seconds.
>> tic, posZNCC = znccPrf(enrollTemplateImage); toc
Elapsed time is 2.024163 seconds.
The code I ran differs from the code in your profiling screenshots. The profiling results differ dramatically.
I use Matlab R2018b, Win10 and a fairly new desktop PC.
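The alternatives from step 7 can be compared side by side. The sketch below uses a synthetic 71-by-71 window (the window size is an assumption, not taken from the uploaded files) and leaves out mean(ref,'all'), so that it also runs on releases that lack that option.

```matlab
% Sketch: several ways to average all elements of a matrix.
rng(0);
ref = rand(71, 71, 'single');            % hypothetical window, synthetic data

m1 = mean(mean(ref));                    % column means, then their mean
m2 = mean(ref(:));                       % flatten with (:), then mean
m3 = sum(ref(:)) / numel(ref);           % explicit sum and divide
m4 = sum(reshape(ref, 1, [])) / numel(ref);

% The results should agree up to single-precision rounding
maxDiff = max(abs([m2 m3 m4] - m1));
fprintf('largest difference: %.2g\n', maxDiff);

% Timings; the ranking may differ between releases and machines
fprintf('mean(mean(ref))        %.3g s\n', timeit(@() mean(mean(ref))));
fprintf('mean(ref(:))           %.3g s\n', timeit(@() mean(ref(:))));
fprintf('sum(ref(:))/numel(ref) %.3g s\n', timeit(@() sum(ref(:))/numel(ref)));
```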
In response to comments
I've made a few more changes to your code and achieved close to a doubling of the speed compared to your function.
I use the uploaded png-file, finger1E.png, in both cases. (Is the code intended to process png or tif files?)
Furthermore, I use the lines
for y = 1:rTem % <<<<<<<<<<<<
for x = 1:cTem % <<<<<<<<<<<<
in both functions, since I believe that's the relevant case. Why do you use " = 1:2 " in some cases?
The script
%%
tic, posZNCC = znccPrf( enrollTemplateImage ); toc
tic, posZNCC_poi = znccPrf_poi( enrollTemplateImage, 'png' ); toc
posZNCC, posZNCC_poi
%%
t = bench()
outputs
>> fego
Elapsed time is 2.611501 seconds.
Elapsed time is 1.430623 seconds.
posZNCC =
209 103 0.76105
posZNCC_poi =
1×3 single row vector
209 103 0.76105
t =
0.081743 0.078711 0.01291 0.083046 1.2827 2.0448
The two return the same value of posZNCC, that is, within the precision displayed by format short. The last line describes the performance of my Matlab+PC. The first four numbers are good; the last two are poor.
Measures to improve the speed
Use single instead of double. It introduces rounding errors, which I believe are acceptable.
% convert to single
target = single(matchImg); % <<<<<<<<<<<<
template = single(enrollTemplateImage); % <<<<<<<<<<<<
Split the calculation of the temporary variable, ref, into two steps. This should decrease the need for shuffling data.
for jj = 1 : (rTar - rTem + 1)
refjj = target( jj:(jj+rTem-1), : ); % <<<<<<<<<<<<
for ii = 1 : (cTar - cTem + 1)
ref = refjj( :, ii:(ii+cTem-1) ); % <<<<<<<<<<<<
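A small check, on toy data, that the two-step extraction yields exactly the same window as indexing target directly with both ranges. The matrix and template sizes below are made up for illustration.

```matlab
% Sketch: row strip first, then column window, compared with direct indexing.
target = reshape(1:30, 5, 6);            % toy 5x6 "image"
rTem = 2;  cTem = 3;                     % toy template size
[rTar, cTar] = size(target);

same = true;
for jj = 1 : (rTar - rTem + 1)
    refjj = target(jj:(jj+rTem-1), :);   % row strip, extracted once per jj
    for ii = 1 : (cTar - cTem + 1)
        ref    = refjj(:, ii:(ii+cTem-1));                    % two-step window
        direct = target(jj:(jj+rTem-1), ii:(ii+cTem-1));      % one-step window
        same = same && isequal(ref, direct);
    end
end
disp(same)
```

The row strip is built once per outer iteration, so the inner loop only slices columns.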
Choose a more efficient way to calculate the mean of a matrix. In a reply to Walter's question I showed a comparison between six different ways to calculate the mean.
meanRef = sum(ref(:))/numelTem; % <<<<<<<<<<<<
Vectorize the two inner loops
tmT = template - meanTem; % <<<<<<<<<<<<
rmR = ref - meanRef; % <<<<<<<<<<<<
sum1 = sum( reshape( tmT.*rmR, [],1 ) ); % <<<<<<<<<<<<
sum2 = sum( reshape( tmT.*tmT, [],1 ) ); % <<<<<<<<<<<<
sum3 = sum( reshape( rmR.*rmR, [],1 ) ); % <<<<<<<<<<<<
ZNCC = sum1 / (sqrt(sum2) * sqrt(sum3)); % <<<<<<<<<<<<
That was a lot of work and it didn't even double the speed. (The two functions are attached.)
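To see that the vectorized expressions compute the same coefficient as an element-by-element loop, the sketch below evaluates both for a single window. The loop is a reconstruction of what the original inner loops presumably did (the original znccPrf is not reproduced in this thread), and the data are random.

```matlab
% Sketch: ZNCC of one template/window pair, looped and vectorized.
rng(1);
template = rand(8, 8);                   % synthetic data
ref      = rand(8, 8);
[rTem, cTem] = size(template);
numelTem = numel(template);
meanTem  = sum(template(:)) / numelTem;
meanRef  = sum(ref(:)) / numelTem;

% Assumed loop form: accumulate the three sums element by element
sum1 = 0; sum2 = 0; sum3 = 0;
for y = 1:rTem
    for x = 1:cTem
        dT = template(y, x) - meanTem;
        dR = ref(y, x) - meanRef;
        sum1 = sum1 + dT * dR;
        sum2 = sum2 + dT * dT;
        sum3 = sum3 + dR * dR;
    end
end
znccLoop = sum1 / (sqrt(sum2) * sqrt(sum3));

% Vectorized form, as in the answer
tmT = template - meanTem;
rmR = ref - meanRef;
znccVec = sum(tmT(:) .* rmR(:)) / (sqrt(sum(tmT(:).^2)) * sqrt(sum(rmR(:).^2)));

fprintf('loop: %.12f  vectorized: %.12f\n', znccLoop, znccVec);
```

Since tmT and sum2 depend only on the template, they could be computed once before the sliding-window loops instead of once per window.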
One more measure (2020-08-07)
The execution time of the script** increases faster than linearly with the size of the image, i.e. with the size of the variable matchImg in the code. The image, finger1E.png, has fairly large white areas to the left and right. Removing most of that white area decreases the execution time substantially without affecting the result.
I made this little test
>> pic = 'finger1E';
>> crop = false;
>> tic, [ posZNCC, P ] = znccPrf_poi_v2( enrollTemplateImage, pic, crop ); toc
Elapsed time is 1.413413 seconds.
>> crop = true;
>> tic, [ posZNCC, P ] = znccPrf_poi_v2( enrollTemplateImage, pic, crop ); toc
Elapsed time is 0.802015 seconds.
All of the measures described above are implemented in znccPrf_poi_v2. With crop==false the elapsed time, 1.41 s, is close enough to the 1.43 s reported above for znccPrf_poi. With crop==true the leftmost 90 and rightmost 58 columns of the 374x388 matchImg are removed by
matchImg = imread('finger1E.png');
if nargin==3 && crop
matchImg = matchImg( :, 91:330 );
end
**) should be function
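The column range 91:330 above is hard-coded for one image. A possible generalization, sketched on a synthetic image, is to locate the first and last columns that contain any non-white pixel. The threshold whiteLevel and the variable names are assumptions, not part of znccPrf_poi_v2.

```matlab
% Sketch: crop the white margins on the left and right automatically.
matchImg = 255 * ones(374, 388, 'uint8');     % synthetic all-white image
matchImg(:, 91:330) = 100;                    % dark "fingerprint" band

whiteLevel = 250;                             % assumed threshold for "white"
hasContent = any(matchImg < whiteLevel, 1);   % true for columns with content
firstCol   = find(hasContent, 1, 'first');
lastCol    = find(hasContent, 1, 'last');

cropped = matchImg(:, firstCol:lastCol);
fprintf('kept columns %d:%d of %d\n', firstCol, lastCol, size(matchImg, 2));
```

Column positions found in the cropped image are relative to it; adding firstCol-1 maps them back to the original image. The cropped image must also remain at least as wide as the template.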

35 comments

Walter Roberson on 25 Jul 2020
How about mean(ref(:)) for timing ?
per isakson on 25 Jul 2020
Edited: per isakson on 26 Jul 2020
(:) is a bit faster than reshape( ___,1,[]).
The differences overall are too large for my taste.
Fego Etese on 25 Jul 2020
Hello Per Isakson, please can you show me where to access the code so I can also run it on my system? I really appreciate your efforts and time.
I used R2018a to run the code, and that's where I got the profiling from. My laptop also runs Win 10 and has a Core i7, but an older generation.
Please I don't actually understand the step you took at no 7.
Thanks a lot!
Fego Etese on 28 Jul 2020
Hello Per Isakson, I really need your help with the edited code so I can run a profiling myself.
Thank you so much
Mario Malic on 28 Jul 2020
Edited: Mario Malic on 28 Jul 2020
I would suggest that you try this on a different machine, since there might be something wrong with your laptop.
What he did in step 7: he calculated the mean value of a matrix in other ways. I also ran it in the Live Editor, and the code finished in a similar time to his.
Also, importing an image with a significantly higher pixel count can result in a much longer run time.
Fego Etese on 28 Jul 2020
Edited: Fego Etese on 28 Jul 2020
Oh, I see. Thanks, Mario Malic, but in my profiling the mean was not the only part that took up the time; according to the profiling there were three other lines that did so too, and I have no idea why.
Mario Malic on 28 Jul 2020
If you consider that his computer spent 1.5 s on that line out of 2 s of total time, then the execution of these lines takes less than 0.5 s and is irrelevant.
As I said, working with a .tiff image is not the same as working with a .png, for the reasons I mentioned in my comment.
If you are working with different code or different files (which may explain the difference between our times and yours), it is hard for us to troubleshoot where the problem is.
Fego Etese on 28 Jul 2020
Alright, I understand. I'll change the files to png and run a test.
As for the code, the code I'm running is the same code that I sent.
Thanks, Mario Malic. I'll let you know if there's any improvement.
Fego Etese on 28 Jul 2020
Yes, those lines are supposed to be irrelevant in the time they take to run. I just can't understand why it's different for me and why it's hanging there. I'll try again with the png version.
Fego Etese on 28 Jul 2020
Edited: Fego Etese on 28 Jul 2020
Hello Mario Malic, these are my new results after using the solution from Per Isakson. The time spent in the mean part has been reduced, but there is one more area that eats up a lot of time: line 29. What do you suggest I can do about this line?
The line just extracts a template-sized window from the bigger image to do the correlation on. What could be a faster alternative?
Thank you so much.
Fego Etese on 28 Jul 2020
By the way, everything takes 6 seconds now. I still don't know why I can't get mine to be similar to yours. Maybe it's the laptop, I'm guessing.
Mario Malic on 28 Jul 2020
I suggest that you upload your .mlx file here or to the OneDrive link, together with the same image that you are using, before I continue with anything from my side.
Fego Etese on 29 Jul 2020
Edited: per isakson on 29 Jul 2020
So I tried running Per Isakson's code and I got an error, but I don't know where it's coming from.
Fego Etese on 29 Jul 2020
Edited: Fego Etese on 29 Jul 2020
Per Isakson, I honestly appreciate the time you've taken to help me, but what can I do about this error? I don't understand it, and I've been trying to.
It seems like the mean function is being passed an invalid value, "all", but I'm not sure.
Replace
mean(template, 'all')
with
mean(template(:))
If I recall correctly, the 'all' option was added in the release after the one you are using.
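A hedged sketch of a release guard for this case. It assumes the 'all' option became available in R2018b, which corresponds to MATLAB version 9.5; the matrix is a toy example. Using mean(template(:)) unconditionally is the simpler choice, since it works on every release.

```matlab
% Sketch: fall back to (:) on releases assumed to lack the 'all' option.
template = magic(4);                      % toy data; all elements sum to 136

if verLessThan('matlab', '9.5')           % assumption: 9.5 is R2018b
    meanTem = mean(template(:));          % works on older releases
else
    meanTem = mean(template, 'all');      % newer syntax
end
disp(meanTem)
```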
Fego Etese on 2 Aug 2020
Thanks, Walter Roberson. I just did that and ran another profiling; it takes up to 19 seconds this time around. Here is the screenshot.
Fego Etese on 2 Aug 2020
I don't know why it's working this way for me; could it be that version 2018a is slower than others?
Fego Etese on 3 Aug 2020
These are the files I used.
Mario Malic on 3 Aug 2020
I have R2018a.
Finger1A and znccPrf_poi: Elapsed time is 12.098890 seconds.
Finger1E and znccPrf_poi: Elapsed time is 3.686021 seconds.
Even though the images look similar, one is grayscale and the other is truecolor.
If you uncomment the figure line, you'll see that for the truecolor image you process two more fingerprints, i.e. two extra calculations. Was that your intention?
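One way to guard against this is to check for a third dimension before the matching loops. This is a minimal sketch on a synthetic array; the variable names are illustrative and not taken from the attached functions.

```matlab
% Sketch: convert a truecolor image to grayscale before reading its size.
img = repmat(uint8(magic(4)), 1, 1, 3);   % synthetic 4x4x3 "truecolor" image

if ndims(img) == 3 && size(img, 3) == 3
    gray = rgb2gray(img);                 % collapse RGB to one channel
else
    gray = img;                           % already grayscale
end

[rTar, cTar] = size(gray);
fprintf('target is %d x %d\n', rTar, cTar);
```

For a real file, img would come from imread; the check itself stays the same.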
Fego Etese on 3 Aug 2020
Yes, you are right, the image was truecolor. I didn't know that, but even after I extracted the gray part it still ran for 20 seconds.
per isakson on 7 Aug 2020
Make your screenshots easier for my old eyes to read by placing the outputs below the code in the Live Editor. Use the icon in the upper right corner.
per isakson on 7 Aug 2020
"Please I don't actually understand the step you took at no 7."
There are many ways in Matlab to calculate the average of all values of a matrix. I've tried a handful. The speed differs by up to a factor of three between the slowest and the fastest, as I showed in an answer to a comment by Walter. Your code spends a large portion of its time calculating the average of matrices. Thus, I replaced mean(mean(ref)) (which is the slowest) with a faster way.
per isakson on 7 Aug 2020
Edited: per isakson on 7 Aug 2020
"[...] I still don't know why I can't get mine to be similar to yours"
We have to run exactly the same code with exactly the same image file to be able to compare execution times in a meaningful way, and measure time in the same way, with tic/toc or the profiler.
I believe that the major part of the differences you report is due to differences in code and image file, and that only a minor part is due to our different hardware.
"[...] could it be that version 2018a is slower than others?" Mario Malic runs R2018a. I would be surprised if the difference between R2018a and R2018b explains more than a few percent.
It's confusing that the script named znccPrf_poi exists in many versions.
What exactly do "I'll change the files to png" and "I extracted the gray part" mean?
Thank you, Per Isakson. Sorry about the screenshot, I'll correct that next time.
I ran exactly the same code as you put up here on this answer page and I got drastically different results, which is why I'm puzzled. I wonder whether it could be my MATLAB software that's causing it, because I also ran code by Mario Malic and it took longer for me than for him.
When I said I extracted the gray part, I meant I just used one of the channels of the image (red, green or blue):
img(:,:,1);
And I also tried
rgb2gray(img)
for the code; all gave me the same results.
Fego Etese on 7 Aug 2020
Edited: Fego Etese on 7 Aug 2020
It keeps hanging at the averaging part and at three other lines that are not supposed to take even a second. This keeps happening for some reason.
Sorry if this screenshot is unreadable. It is the last profiling screenshot I uploaded here recently, but I wanted to show you the lines it keeps hanging at.
per isakson on 7 Aug 2020
And see my addition to the answer.
Fego Etese on 7 Aug 2020
Thank you for your time, Per Isakson, I really appreciate it.
The 1:2 was an error from when I was testing something in the code; 1:rTem and 1:cTem are the right ones. Also, where can I get the zncc_poi_v2.m file that you just ran, so I can run it and check the time it takes on my system? If you don't mind, please let me have the image you used as well.
I understand the cropping you did, but I don't know how it will affect my code, because I have to run the code in a for loop 20 times to get the maximum correlation over 20 different rotation values. The image will be rotated, and if it is cropped I'm not sure how the results will react, but I will give it a try and let you know how it goes.
To be honest, if I can just get the 1.4 seconds I see you getting, I will be fine with it; mine just takes unnecessarily longer for some reason. Let me try version 2 and see how it goes.
Thanks
I ran (see my answer regarding znccPrf_poi_v2)
>> profile on
>> [ posZNCC, P ] = znccPrf_poi_v2( enrollTemplateImage, 'finger1A', false );
>> profile viewer
where finger1A.png is handled by
case 'finger1A'
matchImg = imread('finger1A.png');
matchImg = rgb2gray( matchImg );
if nargin==3 && crop
matchImg = matchImg( :, 91:end );
end
Excerpt from the profiling result
Your code runs more than three times as many iterations of the inner loop as mine. The reason probably lies in the value of cTar.
per isakson on 7 Aug 2020
Edited: per isakson on 7 Aug 2020
I knew there was a good reason to write "Caveat: I've never seriously used the Live Editor."
Fego Etese on 7 Aug 2020
Please can you upload znccPrf_poi_v2, so I can run it on my system too?
Yes, I have read this before, how live scripts run slower. I still use it because of its UI that is nice, but I'll try the .m version for this code instead.
Thanks
per isakson on 7 Aug 2020
"Yes, I have read this before, how live scripts run slower"
How come you have not tested if this affects your function and how come you didn't provide this reference in your question or a comment? You let people work with your problem, without providing this potentially important information. That's not fair!
Fego Etese on 7 Aug 2020
Edited: Fego Etese on 7 Aug 2020
No, please don't take it the wrong way. I said I read it before, but I haven't seen it actually happen. The first time I ran the ZNCC code it was in an .m file, not an .mlx file, and it gave the same results, so I just stuck with the .mlx version because of the UI. Any time my code runs slowly, I first switch to the .m version to see if it runs faster there; otherwise I keep using the .mlx version.
I also ran your zncc_poi version 1, which was in an .m file, to produce the last profiling screenshot I showed you earlier, so it isn't giving me any different result from running in an .mlx file.
per isakson on 10 Aug 2020
Edited: per isakson on 10 Aug 2020
"I said I read it before, but I haven't seen it actually happen"
The problem is that I can only know what you have actually written in this thread of comments. The thread has been going on for two weeks and contains thirty-two comments. I don't remember all the details, and I have no good ideas about why you see such poor performance.
You may very well have read "live scripts run slower" and compared the performance of mlx-functions and m-functions, but I cannot know that since you have not reported it here. Your justification "because of it's UI that is nice" sounds absurd to me when the focus should be on performance.
Please don't assume that I can deduce from the profiling screenshots whether they show results from mlx- or m-functions.
per isakson on 10 Aug 2020
You need to approach the problem more systematically. Don't cut corners.
What do you know for sure?
What could be the reasons for the poor performance? There were some good hypotheses in a recent comment (now deleted) by Mario Malic. Make your own list and test one hypothesis at a time. Document the results.
znccPrf_poi_v2 differs from znccPrf_poi only regarding the cropping of white area. Thus, I think it's better that you concentrate on reproducing my result with znccPrf_poi.
per isakson on 10 Aug 2020
Edited: per isakson on 10 Aug 2020
I looked at your screenshots of the for ii = 1 : (cTar - cTem + 1) loop. I noticed that the number of iterations differs between the first and the rest. I find nothing in your July 28 comments that explains the difference.
  • Fego Etese on 28 Jul 2020 shows 96672 iterations
  • Fego Etese on 28 Jul 2020 shows 332576 iterations
  • Fego Etese on 2 Aug 2020 shows 332576 iterations
  • Fego Etese on 7 Aug 2020 shows 332576 iterations
The screenshot of my 7 Aug 2020 comment shows 96672 iterations.
P.S. The dates are the dates displayed here. Local time may add or subtract a day.
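One reading of the two iteration counts listed above, offered as a hypothesis rather than a diagnosis. Assume the code reads the image size with [rTar, cTar] = size(target), that both images are 374x388 pixels, and that the template is 71x71. The template size, and the size of finger1A, are inferred from the counts and are not stated anywhere in the thread. Under those assumptions a grayscale target gives 96672 window positions, while a truecolor target gives 332576, because two-output size on a 3-D array folds the colour planes into the column count.

```matlab
% Sketch: iteration counts for grayscale vs truecolor targets (assumed sizes).
rTem = 71;  cTem = 71;                    % ASSUMED template size (inferred)
gray = zeros(374, 388, 'uint8');          % grayscale stand-in
rgb  = zeros(374, 388, 3, 'uint8');       % truecolor stand-in

[rTar, cTar] = size(gray);                % 374 and 388
nGray = (rTar - rTem + 1) * (cTar - cTem + 1);

[rTar3, cTar3] = size(rgb);               % 374 and 388*3 = 1164
nRgb = (rTar3 - rTem + 1) * (cTar3 - cTem + 1);

fprintf('grayscale: %d iterations, truecolor: %d iterations\n', nGray, nRgb);
```

If this reading is right, converting the image to a single channel before calling size would bring the loop count back to the smaller figure.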


Release: R2018a

Asked: 24 Jul 2020

Edited: 10 Aug 2020
