Opening 14000 tif files...

Jobi on 18 Mar 2013
Hello,
I am trying to reduce the time taken to open ~14000 tif files of about 400 KB each.
I have tried different approaches, essentially stripping out whatever I do not need and monitoring the effect with the profiler, but the results seem inconsistent from one run to the next.
My test is the following: I evaluate 3 different ways of reading the files (ordered from worst to best, I hope...). For the last ones I copied rtifc.mexw64 into the current folder.
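%% imread
im = zeros(424,424,14000,'uint16');   % preallocate the full image stack
tic
for k = 1:length(fname)
    tmpf = [Folder fname{k}];         % full path of the k-th file
    tim = imread(tmpf);
    im(:,:,k) = tim;
end
T = toc;
[T T/14]                              % total time [s], time per file [ms] for 14000 files
clear im tim tmpf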
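%% feval tifread
im = zeros(424,424,14000,'uint16');   % preallocate the full image stack
tic
tf = imformats('tif');                % look up the registered tif reader once
for k = 1:length(fname)
    tmpf = [Folder fname{k}];
    tim = feval(tf.read, tmpf, 1);    % call the reader directly, first image of the file
    im(1:424,1:424,k) = tim;
end
T = toc;
[T T/14]                              % total time [s], time per file [ms] for 14000 files
clear im tim tmpf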
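%% rtifc
im = zeros(424,424,14000,'uint16');   % preallocate the full image stack
tic
tmp.index = 1;
tmp.PixelRegion = {[1 424],[1 424]};
tmp.info = imfinfo([Folder fname{1}]);   % metadata of the first file, reused for all files
for k = 1:length(fname)
    tmp.filename = [Folder fname{k}];
    [tim,trash1,trash2] = rtifc(tmp);    % low-level MEX reader copied from the toolbox
    im(1:424,1:424,k) = tim;
end
T = toc;
[T T/14]                              % total time [s], time per file [ms] for 14000 files
clear im tim tmp trash1 trash2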
From there I have 2 points:
1. There is no big improvement between the 3...
  • 70s 67s 66s
  • 69s 63s 64s
  • 66s 64s 61s
2. From one run to another I sometimes see massive differences for all 3 methods: 29s 18s 16s
Do you have any suggestion? What am I doing wrong? THANK YOU!!! :)
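One way to read the numbers above is to compare the gain between methods with the spread between runs. The sketch below only re-uses the timings quoted in the question; it assumes that each bullet is one run and that the three columns are imread, feval(tf.read) and rtifc in that order, which the post implies but does not state.

```matlab
% Timings reported in the question, in seconds.
% Assumption: rows are runs, columns are imread, feval(tf.read), rtifc.
T = [70 67 66;
     69 63 64;
     66 64 61];

meanT  = mean(T, 1);                      % mean time per method
rangeT = max(T, [], 1) - min(T, [], 1);   % run-to-run spread per method
gain   = meanT(1) - meanT(3);             % mean gain of rtifc over imread

fprintf('mean per method:   %.1f %.1f %.1f s\n', meanT);
fprintf('spread per method: %d %d %d s\n', rangeT);
fprintf('imread -> rtifc gain: %.1f s\n', gain);
```

Under that reading, the mean gain from the first to the last method is a little under 5 s, which is about the same size as the 4 to 5 s spread between runs of a single method.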
  1 comment
Jan on 18 Mar 2013
im(1:424,1:424,k) = tim; is slower than im(:,:,k) = tim;, but this will not affect the total runtime significantly.
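A rough way to check this on a given machine is to time both assignment forms without any file access. This is only a sketch: the stack is reduced to 100 frames (an assumed size, chosen to keep memory use small), and the absolute numbers will vary with the MATLAB release.

```matlab
n = 424;
nFrames = 100;                        % assumed reduced stack, not the 14000 of the thread
tim = ones(n, n, 'uint16');           % stand-in for one image

im = zeros(n, n, nFrames, 'uint16');
tic
for k = 1:nFrames
    im(1:n, 1:n, k) = tim;            % explicit index ranges
end
tExplicit = toc;
imExplicit = im;                      % keep the result for comparison

im = zeros(n, n, nFrames, 'uint16');
tic
for k = 1:nFrames
    im(:, :, k) = tim;                % colon indexing
end
tColon = toc;

fprintf('explicit: %.4f s, colon: %.4f s\n', tExplicit, tColon);
```

Both loops fill the stack with the same data, so any difference is purely indexing overhead, which can then be compared with the tens of seconds spent reading the files.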


Accepted answer

Walter Roberson on 18 Mar 2013
Part of the time spent reading goes into getting the files into the operating system's memory cache. If you have read moderately sized files recently, then it might not be necessary to fetch them from hard disk into main memory; instead a memory-to-memory copy might be all that is needed.
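The caching effect can be looked at by reading the same files in two consecutive passes and timing each pass. The sketch below is self-contained, so it first writes a few small TIFFs to a temporary folder (the file names, count and 64x64 size are made up for illustration). Because those files have just been written, they are probably cached already; on real data the interesting comparison is the first pass after a reboot against a later pass.

```matlab
% Create a few small TIFFs so the sketch is self-contained; on real data,
% set 'files' to the existing file paths instead.
folder = tempname;
mkdir(folder);
nFiles = 20;                                     % illustrative count
files = cell(1, nFiles);
for k = 1:nFiles
    files{k} = fullfile(folder, sprintf('img_%03d.tif', k));
    imwrite(repmat(uint16(k), 64, 64), files{k});
end

% Read the same files in two consecutive passes and time each pass.
nPasses = 2;
tPass = zeros(1, nPasses);
for p = 1:nPasses
    tic
    for k = 1:nFiles
        tim = imread(files{k});
    end
    tPass(p) = toc;
end
fprintf('pass 1: %.3f s, pass 2: %.3f s\n', tPass(1), tPass(2));

rmdir(folder, 's');                              % remove the temporary files
```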
  8 comments
Jan on 18 Mar 2013
@Jobi: What is your actual problem? Does it matter if you wait for 30 or 60 seconds?
Jobi on 19 Mar 2013
Yes, because this code will run on different computers with different hardware configurations. Moreover, the size of the tif files (mainly the number of pixels) is not constant. This case with 14000 tif files is not even our worst case.
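If the image size changes between data sets but stays the same within one set, the hard-coded 424 can be replaced by the size reported by imfinfo for the first file. This is a minimal sketch under that assumption, and it also assumes single-page 16-bit grayscale files; the temporary files and their 48x32 size exist only to make it runnable.

```matlab
% Create a few small TIFFs so the sketch is self-contained.
folder = tempname;
mkdir(folder);
nFiles = 5;                                      % illustrative count
files = cell(1, nFiles);
for k = 1:nFiles
    files{k} = fullfile(folder, sprintf('img_%03d.tif', k));
    imwrite(repmat(uint16(k), 48, 32), files{k});
end

% Query the first file once and size the stack from its metadata.
% Assumption: every file in the set has the same height and width.
info = imfinfo(files{1});
im = zeros(info.Height, info.Width, nFiles, 'uint16');
for k = 1:nFiles
    im(:, :, k) = imread(files{k});
end

rmdir(folder, 's');                              % remove the temporary files
```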


More answers (2)

Sean de Wolski on 18 Mar 2013
If you have the Parallel Computing Toolbox, how about using parfor?
doc parfor
If you do not have the PCT, I'm sure your friendly Sales Rep could set you up with a trial.
  1 comment
Jobi on 18 Mar 2013
I have tried with 4 workers (4 CPUs) but it actually had the opposite effect (>300s...) :( If I am correct, I just need to add matlabpool and use parfor rather than for in the code?
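For reference, the loop change itself looks roughly like the sketch below: the for becomes parfor, and the stack is written as im(:,:,k) so that it can be sliced across workers. The temporary files and their 32x32 size are made up to keep the sketch runnable, and opening the pool is left out on purpose (matlabpool at the time of this thread, parpool in later releases). Whether this is faster depends on whether the disk or the CPU is the bottleneck, so a slowdown like the one reported here is possible.

```matlab
% Create a few small TIFFs so the sketch is self-contained.
folder = tempname;
mkdir(folder);
nFiles = 8;                                      % illustrative count
files = cell(1, nFiles);
for k = 1:nFiles
    files{k} = fullfile(folder, sprintf('img_%03d.tif', k));
    imwrite(repmat(uint16(k), 32, 32), files{k});
end

im = zeros(32, 32, nFiles, 'uint16');
parfor k = 1:nFiles
    % im is sliced along the third dimension, files is sliced by k
    im(:, :, k) = imread(files{k});
end

rmdir(folder, 's');                              % remove the temporary files
```

Without the Parallel Computing Toolbox the parfor loop still runs, just serially.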



Jan on 18 Mar 2013
Accessing the hard disk is influenced by many different factors: disk fragmentation, defragmentation tools, other jobs accessing the disk, weak blocks which are moved transparently, virus checkers which check modified files at the first access, downloads of updates in the background, other tasks which swap data to the disk, etc. Therefore a difference of 50% is not very surprising.
The best setup for a speed measurement with disk access is using a dedicated disk (not partition!) for the data.
  1 comment
Jobi on 19 Mar 2013
OK, I just thought there would be a way to set priorities, or tricks that I have not thought about.
