Fastest Way to write data to a text file - fprintf

Brian on 2 Aug 2013
I am writing a lot of data to a text file one line at a time (1.7 million rows, 4 columns), and the columns hold different data types. I'm wondering if there is a better way to do this than one line at a time that might yield much faster results.
Here is what I'm doing now.
% ExpSymbols = char array
% ExpDates   = numeric array
% MyFactor   = numeric array
% FctrName   = char array
ftemp = fopen('FileName','w' );
for i = 1:length(MyFactor)
    fprintf(ftemp, '%s,%i,%f,%s\r\n',ExpSymbols(i,:), ExpDates(i,1), MyFactor(i,1),[FctrName '_ML']);
end
fclose(ftemp);
Thanks in advance,
Brian

Accepted Answer

Jan on 2 Aug 2013
You can try to suppress the flushing by opening the file in 'W' mode instead of 'w':
ftemp = fopen('FileName', 'W'); % uppercase W
Fmt = ['%s,%i,%f,', FctrName, '_ML\r\n']; % the constant suffix goes into the format string
for i = 1:length(MyFactor)
    fprintf(ftemp, Fmt, ExpSymbols(i,:), ExpDates(i), MyFactor(i));
end
fclose(ftemp);
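Whether 'W' actually helps depends on the machine and the disk, so it is worth measuring. The following is only a minimal timing sketch: the sample arrays, the row count `n`, and the temporary file names are made up for illustration and stand in for the real data.

```matlab
% Hypothetical sample data standing in for the real arrays.
n          = 1000;
ExpSymbols = repmat('ABCD', n, 1);     % char array, one symbol per row
ExpDates   = repmat(20130802, n, 1);   % dates stored as integer-valued doubles
MyFactor   = rand(n, 1);
FctrName   = 'Fctr';

Fmt     = ['%s,%i,%f,', FctrName, '_ML\r\n'];
modes   = {'w', 'W'};                  % 'W' writes without automatic flushing
elapsed = zeros(1, 2);

for m = 1:2
    fname = [tempname, '.txt'];        % throwaway file in the temp folder
    fid   = fopen(fname, modes{m});
    tic;
    for i = 1:n
        fprintf(fid, Fmt, ExpSymbols(i,:), ExpDates(i), MyFactor(i));
    end
    fclose(fid);
    elapsed(m) = toc;
    delete(fname);
end

fprintf('w: %.3f s, W: %.3f s\n', elapsed(1), elapsed(2));
```

With only 1000 rows the difference may be within noise; raising `n` toward the real row count should give a more meaningful comparison.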
9 Comments
Brian on 5 Aug 2013
Edited: Brian on 5 Aug 2013
You're right, saving the variables by themselves is much quicker than writing to a flat file. I changed my code to write to C:\Temp (as you suggested above), and the save took 0.97 seconds and the load took 0.33 seconds. The formatted flat file is 62 MB in size and the .mat file is only 15 MB or so. I do need a properly formatted file for the other system to read, as it can't read .mat files.
All fields need to be in one file but it sounds like you're saying that the writing of mixed data types is what's making the write unnecessarily slow. Can I write one data type at a time to the same file using a loop structure for each data type?
dpb on 5 Aug 2013
A) Can you offload the formatting from this code to a second program that processes the .mat files and writes the formatted ones? It won't save any time overall, but it moves the work to a place where the bottleneck might not be so evident. For example, you could have a second background process doing that conversion while the primary analyses are done interactively. Whether that helps depends on the actual workflow, of course.
B) Can your target app read the data variables sequentially, one after the other, instead of a full record at a time as you're currently writing them? If so, sure, you can write each variable without any loop at all, and it will likely be faster by at least a measurable amount, as Jan suggests.
C) You might just see what the text option of save does in comparison for speed. I don't know that it'll help, but what the hey...
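Options B and C could look roughly like the sketch below. The sample values and temporary file names are invented for illustration, and whether the target system can read variables stored one after the other is an open assumption.

```matlab
% Hypothetical sample data standing in for the real arrays.
n        = 5;
ExpDates = repmat(20130802, n, 1);
MyFactor = (1:n).' / 4;

% Option B: one vectorized fprintf call per variable, no loop.
% fprintf reuses the format for every element of the array.
fname = [tempname, '.txt'];
fid   = fopen(fname, 'W');
fprintf(fid, '%d\r\n', ExpDates);
fprintf(fid, '%f\r\n', MyFactor);
fclose(fid);

% Option C: let save write the numeric data as text.
asciiName = [tempname, '.txt'];
save(asciiName, 'MyFactor', '-ascii', '-double');
```

Both files hold only the numeric columns; the char data would still need its own write.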


More Answers (1)

dpb on 2 Aug 2013
Edited: dpb on 3 Aug 2013
It's a pita for mixed fields--I don't know of any clean way to mix them in fprintf c
I generally build the string array internally then write the whole thing...
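cma = repmat(',', length(dates), 1);       % the delimiter column
nl  = repmat(char(10), length(dates), 1);  % one newline per row
out = [symb cma num2str(dates) cma num2str(factor) cma names nl];  % numeric columns need num2str
fid = fopen('FileName', 'w');
fprintf(fid, '%s', out.');  % transpose: fprintf reads the char array column by column
fid = fclose(fid);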
names is a placeholder for the FactorName, which I guess may be a constant? If so, it can be inserted into the format string as Jan assumed; if not, it needs to be built as a char array with one row per record and concatenated in the same way as the column of commas.
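A different loop-free route, not the one described above, is to pack the mixed columns into a cell array and let fprintf cycle through the format. This is only a sketch with made-up sample values; it assumes the factor name is a constant, and note that cellstr drops trailing blanks from each symbol row. For 1.7 million rows the cell array costs a fair amount of memory, so it may or may not be faster in practice.

```matlab
% Hypothetical sample data standing in for the real arrays.
ExpSymbols = ['AAPL'; 'MSFT'; 'GOOG'];
ExpDates   = [20130801; 20130802; 20130805];
MyFactor   = [0.5; 1.25; 2];
FctrName   = 'Fctr';

% One row of C per field, one column per record, so that C{:} expands
% as symbol1, date1, factor1, symbol2, date2, factor2, ...
C = [cellstr(ExpSymbols), num2cell(ExpDates), num2cell(MyFactor)].';

Fmt   = ['%s,%d,%f,', FctrName, '_ML\r\n'];
fname = [tempname, '.txt'];
fid   = fopen(fname, 'W');
fprintf(fid, Fmt, C{:});   % single call, format is reused for each record
fclose(fid);
```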
6 Comments
Brian on 5 Aug 2013
Just converting my two numeric arrays to strings takes 55 seconds. This is slower than writing the file with the mixed data types using fprintf and the 'W' argument. I'm still not sure what you are referring to when you talk about "stream." I'm not familiar with that.
dpb on 5 Aug 2013
Also called "binary". It's unformatted I/O, which has these speed benefits:
a) full precision for float values at the minimum number of bytes per entry,
b) no format conversion overhead on either input or output.
doc fwrite % and friends
or, if you could stay in MATLAB, then
doc save % and load is only slightly higher-level
The possible disadvantage is, of course, you can't just look at a file and read it; but who's going to manually be looking at such large files, anyway?
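For reference, an unformatted round trip with fwrite and fread might look like the sketch below. The sample vector and the temporary file name are invented for illustration, and it assumes the reading side knows the element type and byte order.

```matlab
% Hypothetical sample data standing in for one numeric column.
MyFactor = [0.5; 1.25; 2];

fname = [tempname, '.bin'];
fid   = fopen(fname, 'w');
count = fwrite(fid, MyFactor, 'double');   % raw 8-byte values, no text conversion
fclose(fid);

fid      = fopen(fname, 'r');
readBack = fread(fid, Inf, 'double');      % read everything back as doubles
fclose(fid);

% The values survive at full precision.
disp(isequal(readBack, MyFactor));
```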

