Speeding up fread for random position in big file

Dear community,
at the moment I need to read one or multiple image(s) from a big data file. Since I am building a live data viewer and the file is too big (20-80 GB) to read all at once, I read it frame by frame. At the moment I use the following commands:
% this only once
fid=fopen('myfile.mraw','r');
% and this each time I need a specific or multiple image(s)
first_frame=251674 % exemplary frame number
numOfFrames=1 % can be 1 or 3, user dependent (3 only if i apply a temporal median filter of length 3)
bitOrder = 'b';
color = 1; % just gray in my case and not color
N=[320 384];
pixels=320*384;
ColorBit=12;
I=zeros(pixels*numOfFrames,1,'uint16');
start = (first_frame-1)*pixels*ColorBit/8; % byte offset of the first requested frame
fseek(fid,start,'bof');
I = fread(fid,pixels*numOfFrames,'ubit12=>uint16',bitOrder);
% convert to image matrix
A(:,:,:)=permute(reshape(I,[N numOfFrames]),[2 1 3 4]);
Is there an easy way to speed up the fread call? Compared to the subsequent displaying etc., this line takes about 97% of the time.
When reading one frame per call I don't get more than 18-20 frames per second. At least 50 fps would be nice.
Unfortunately I cannot supply the mraw file due to its size.
Reading more frames at once is not possible here, since first_frame can change, e.g. by 1000, between calls.
The file contains high speed recordings at 20k fps, together with other data.
best regards
Jonas
  10 Comments
Jonas on 23 Jun 2022
The syntax is alright; it is also used this way in the doc of fread. The command works as intended and the resulting image(s) are correct.


Accepted Answer

Walter Roberson on 24 Jun 2022
In my experience, you can convert 12-bit data more efficiently than fread() does.
If memory serves:
raw = uint16(fread(fid, [3 pixels*numOfFrames/2], '*uint8')); % every 3 bytes hold two 12-bit pixels
firsts = First12(raw(1,:),raw(2,:));
seconds = Second12(raw(2,:),raw(3,:));
where First12 and Second12 are lookup tables such that First12(A,B) = A*16 + bitand(B, 240)/16 and Second12(A,B) = bitand(A,15)*256 + B
One might think that it is more efficient to just do those numeric computations for the whole array of raw data, without using lookup tables -- and if you were to write a bit of C code to do the conversion, it might well be more efficient to do that. But my memory is that in MATLAB, it turns out to be more efficient to pre-compute the values and use array lookup -- since the pre-computation is over a relatively small array compared to the input values, and table lookup is comparatively fast since it only involves address arithmetic and pulling values out of memory. Applying those bitand() and multiplications over the whole large array is, if my memory is correct, slower at the MATLAB level (but again, some C code might well rip through it.)
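A minimal sketch of the lookup-table idea, under some assumptions: big-endian bit order as in the question, hard-coded synthetic bytes instead of the real fread call, and table entries stored at (A+1, B+1) because MATLAB indexing is 1-based. Note that indexing a matrix with two index vectors returns every row/column combination, so the element-wise lookup is written here with linear indices. Whether this actually beats 'ubit12=>uint16' would need to be timed on the real file.

```matlab
% Build both 256-by-256 tables once; entry (A+1, B+1) holds the result for
% byte values A and B (the +1 is needed because MATLAB indexing is 1-based).
[A, B] = ndgrid(0:255, 0:255);
First12  = uint16(A*16 + floor(B/16));     % byte 1, then high nibble of byte 2
Second12 = uint16(mod(A, 16)*256 + B);     % low nibble of byte 2, then byte 3

% Synthetic stand-in for the uint8 data read with fread(fid, [3 n], '*uint8'):
% the 12-bit values 0xABC, 0x123, 0xFFF, 0x000 packed into six bytes.
% uint32 is used so that the linear index below cannot saturate.
raw = uint32([171 255; 193 240; 35 0]);

% Element-wise lookup through linear indices: row + 256*(column-1).
firsts  = First12(raw(1,:) + 256*raw(2,:) + 1);
seconds = Second12(raw(2,:) + 256*raw(3,:) + 1);

% Interleave back into pixel order: first1, second1, first2, second2, ...
I = reshape([firsts; seconds], [], 1);
```

The resulting column vector I can then go through the same reshape and permute step as in the question.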
  5 Comments
Walter Roberson on 27 Jun 2022
When I was designing the code with 2D lookup tables, I could have used linear indexing by multiplying one input by a constant and adding the other -- but that would have required an explicit multiply and add. I figured that it would probably be faster to use 2D indices, with the implicit lookup, as the Execution Engine would be able to convert that into low-level multiply-and-add instead of explicit. It might even possibly be able to convert to base register + offset machine code.
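To illustrate the two indexing styles mentioned here, a small sketch with a hypothetical stand-in table T (not the real lookup table). For scalar byte values, 2D subscripts and the explicit linear index address the same element. For index vectors the two forms differ: two subscript vectors select a whole submatrix, while a linear index vector selects one element per entry.

```matlab
T = reshape(uint16(0:65535), 256, 256);   % hypothetical 256-by-256 table

a = 171; b = 193;                 % two byte values in the range 0..255
v2d  = T(a + 1, b + 1);           % 2D subscripts, index arithmetic is implicit
vlin = T(a + 256*b + 1);          % explicit multiply-and-add linear index

av = [0 1]; bv = [0 1];           % two pairs of byte values
sub = T(av + 1, bv + 1);          % 2-by-2 submatrix (all combinations)
el  = T(av + 256*bv + 1);         % 1-by-2, one lookup per pair
```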


More Answers (2)

Steven Lord on 23 Jun 2022
Since your data is Big Data (too large to fit in memory all at once) you may want to investigate creating a datastore for your files and using that datastore to create a tall array on which to operate. See this section of the documentation for more information about the tools and techniques you can use for working with tall arrays and datastore objects.
  9 Comments
Jonas on 30 Jun 2022
Actually that was not my decision. I just got those 12-bit packed files to work with and need to make the best of it.



Image Analyst on 23 Jun 2022
There is also memmapfile - a function older than the datastore family of functions.
Maybe @Steven Lord can explain the differences and when to use each. 🙂🤞
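A rough sketch of how memmapfile could be used for random frame access. The file is mapped once as raw uint8 and the bytes of a frame are then pulled out by index. The tiny demo file, its 4-pixel "frames" and the variable names are assumptions for illustration only; the real file has 320*384 pixels per frame, and whether this is faster than fseek plus fread has to be measured.

```matlab
% Write a small hypothetical demo file standing in for 'myfile.mraw'.
fname = [tempname '.bin'];
pixels = 4;                               % pixels per frame in this demo
bytesPerFrame = pixels * 12 / 8;          % 12-bit packed, so 6 bytes per frame
fid = fopen(fname, 'w');
fwrite(fid, uint8(0:(4*bytesPerFrame - 1)), 'uint8');   % 4 frames, bytes 0..23
fclose(fid);

% Map the file once; m.Data is then an N-by-1 uint8 view of the file.
m = memmapfile(fname, 'Format', 'uint8');

% Fetch the packed bytes of an arbitrary frame without fseek/fread.
first_frame = 3;
numOfFrames = 1;
start = (first_frame - 1) * bytesPerFrame;              % 0-based byte offset
raw = m.Data(start + 1 : start + numOfFrames*bytesPerFrame);
raw = reshape(raw, 3, []);                % one column per pair of 12-bit pixels

clear m                                   % release the mapping before deleting
delete(fname);
```

The 3-by-n uint8 matrix raw still holds packed data, so the 12-bit unpacking step is needed afterwards.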

Release

R2022a
