Data is not saving to the workspace

Aaron Smith on 10 Feb 2017
Commented: Stephen23 on 21 Feb 2017
I have a large text file composed of a single row of 52480000 numbers separated by semicolons. I'm attempting to organize the data into 51250 rows of 1024 numbers and then separate this into distinct blocks of 1025 x 1024. The numbers need to stay in the same order as in the original file (with every 1025th number being the start of a new row). I have tried using a while loop with an if statement.
R = 51250;
C = 1024;
fid = fopen( 'TEST_A.asc');
k = 0;
while ~feof(fid)
    z = textscan( fid, '%d', R*C, 'EndOfLine', ';');
    if ~isempty(z{1})
        k = k + 1;
        s = fprintf( 'TEST_A.asc', ';');
        dlmwrite( s, reshape( z{1}, 1025, []), ';')
    end
end
fclose(fid);
This code does not create an initial cell of 52480000 numbers, which means that none of the subsequent variables (s and z) are created in the workspace. The problem is that if I textscan all of the data into MATLAB before formatting it, I get an out-of-memory error. Does anyone notice anything that I have missed in this code, or have any pointers?
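One detail that matters for the ordering requirement described above: MATLAB's reshape fills a matrix column by column. A vector held in file order therefore has to be reshaped to (numbers per row)-by-(number of rows) and then transposed so that consecutive numbers run along each row. The sketch below uses small made-up sizes and illustrative variable names (nPerRow, nRows, v and M are not from the original code):

```matlab
% Toy sizes standing in for the real row length and row count (assumed values).
nPerRow = 4;
nRows   = 3;
v = 1:(nPerRow*nRows);   % stands in for the numbers in file order

% reshape is column-major: each COLUMN receives nPerRow consecutive numbers,
% so transpose afterwards to get consecutive numbers along each ROW.
M = reshape(v, nPerRow, nRows).';

disp(M)
```

Without the transpose, each group of consecutive numbers would end up in a column rather than a row.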
  26 comments
Stephen23 on 15 Feb 2017
Edited: Stephen23 on 15 Feb 2017
You could register with Dropbox, MediaFire, Google Drive, or one of the many other file sharing websites, and send me the link to the file (via my profile page: please also include a link to this thread, otherwise the email will get deleted automatically).
Stephen23 on 15 Feb 2017
Edited: Stephen23 on 15 Feb 2017
@Aaron Smith: I received your message. I will have a look a little later.

Accepted Answer

Stephen23 on 15 Feb 2017
Edited: Stephen23 on 15 Feb 2017
Thank you for the file. What I learned from the actual data file is that it is not "composed of a single row": there are in fact 51200 rows in the file that I received.
Why is this important? Because computers are stupid, and they do exactly what they are told to do. Knowing how to read a file correctly requires knowing what format the file has. In this case it is also quite handy for us, because it is trivial to read and write lines without much processing.
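A quick way to check a file's layout before deciding how to parse it is to look at the first line and count the lines, without loading the whole file. The following is only a minimal sketch: it writes a tiny stand-in file (hypothetical name and contents) to the temporary folder so that it can run on its own, and the variable names are illustrative.

```matlab
% Build a tiny stand-in file (hypothetical contents) so the sketch runs on its own.
fnm = fullfile(tempdir, 'layout_demo.asc');
fid = fopen(fnm, 'wt');
fprintf(fid, '1;10;20;30\n2;11;21;31\n1;12;22;32\n');
fclose(fid);

% Peek at the first line and count the lines, one line at a time.
fid = fopen(fnm, 'rt');
firstLine = fgetl(fid);                     % first line as text, newline removed
nFields = numel(strsplit(firstLine, ';'));  % semicolon-separated fields on that line
nLines = 1;
while ischar(fgetl(fid))                    % fgetl returns -1 (numeric) at end of file
    nLines = nLines + 1;
end
fclose(fid);

fprintf('%d lines, %d fields on the first line\n', nLines, nFields);
```

If the lines of a real file end with a trailing semicolon, strsplit will report one extra (empty) field, so the field count should be read with that in mind.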
The code below worked correctly for me, reading the 200 MB file, and creating 50 smaller files with the rows following the same order as the original file.
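sbd = 'temp'; % subdirectory holding the input file and the output files
f2d = fopen(fullfile(sbd,'temp_01.asc'),'wt'); % output file (reopened when the first block starts)
f1d = fopen(fullfile(sbd,'TEST_A.asc'),'rt'); % large input file
k = 0; % block counter
while ~feof(f1d)
    str = fgetl(f1d); % read one line as text
    if sscanf(str,'%d')==1 % leading line count is 1: a new block starts
        k = k+1;
        fclose(f2d);
        fnm = fullfile(sbd,sprintf('temp_%02d.asc',k));
        f2d = fopen(fnm,'wt');
    end
    fprintf(f2d,'%s\n',str); % copy the line unchanged
end
fclose(f1d);
fclose(f2d);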
Note that:
  1. the size of the output matrices is 1024x1025 (because there are 1025 numbers per line). This is correct because the first number of each line is simply a line count (check the files and you will see).
  2. the lines are exactly the same as in the original file.
  3. MATLAB holds only one line in memory at a time: the lines are simply read from the large file and written directly to a new file.
  4. as a result: no matrix, and no converting from string to numeric and back to string.
  5. it is slow because the file is large... reading and writing 51200 lines of 1025 numbers each will take some time.
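The splitting idea can be tried on a toy file before running it on the full data. The sketch below is a variant of the loop above, not the code that was run on the real file: it adds an explicit end-of-file check on the value returned by fgetl, and it assumes the first line of the input starts with the counter 1. The file names (toy_input.asc, toy_01.asc, ...) and contents are made up for illustration.

```matlab
% Hypothetical toy input: the first number on each line is a line count
% that restarts at 1 at the start of every block.
sbd = tempdir;
src = fullfile(sbd, 'toy_input.asc');
fid = fopen(src, 'wt');
fprintf(fid, '1;5;6\n2;7;8\n1;9;10\n2;11;12\n');
fclose(fid);

% sscanf with '%d' stops at the first character it cannot convert (the ';'),
% so it returns only the leading line count.
lead = sscanf('2;7;8', '%d');

f1d = fopen(src, 'rt');
f2d = -1;                               % no output file open yet
k = 0;                                  % block counter
while true
    str = fgetl(f1d);
    if ~ischar(str), break, end         % fgetl returns -1 at end of file
    if isequal(sscanf(str, '%d'), 1)    % line count is 1: start a new block
        if f2d > 0, fclose(f2d); end
        k = k + 1;
        f2d = fopen(fullfile(sbd, sprintf('toy_%02d.asc', k)), 'wt');
    end
    fprintf(f2d, '%s\n', str);          % copy the line unchanged
end
fclose(f1d);
if f2d > 0, fclose(f2d); end
```

After this runs, the temporary folder should contain toy_01.asc and toy_02.asc, each holding two lines copied unchanged from the toy input.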
  7 comments
Aaron Smith on 21 Feb 2017
Thanks Stephen, that code works as far as I can see. What, may I ask, are the two ~ in the code doing?
