Reading a text file and rows with data
Hi,
I need to read the text file shown in the image below using a MATLAB script. I have tried the code below and would appreciate it if someone could guide me in getting it to work. I have attached the text file here.
Thanks in advance.
myFolder = 'C:\Users\Desktop\'
filePattern = fullfile(myFolder, '*.txt');
csvFiles = dir(filePattern);
fmt='%d %4d/%2d/%2d %2d:%2d %d %*[^\n]';
for i=1:length(csvFiles)
fid = fopen(fullfile(myFolder,csvFiles(i).name));
c=cell2mat(textscan(fid,fmt,'headerlines',18,'collectoutput',1,'delimiter','\t'));
fid=fclose(fid);
end

1 comment
dpb
on 14 June 2016
Two problems, one easy, another "not so much"...
a) You can't mix types with 'collectoutput', and the floating point value fails with the %d format, so you need to switch every %d to %f in the format string.
b) The file is NOT tab-delimited; it's the bane of C text input: blank-delimited with missing data. This is essentially impossible to deal with in C (and hence in MATLAB, since it uses C formatted I/O routines derived from fscanf and friends). You can see this if you make the above correction and look at the intermediate result from textscan:
>> fmt='%f %4f/%2f/%2f %2f:%2f %f %*[^\n]';
>> d=cell2mat(textscan(fid,fmt,'headerlines',18,'collectoutput',1));
>> whos d
Name Size Bytes Class Attributes
d 1x7 56 double
>> d
d =
1.0e+04 *
2.3002 0.1997 0.0010 0.0001 0 0 0.0000
>> d(end)
ans =
0.2928
>>
This shows the first record was read correctly; so why is there only one record???
>> frewind(fid) % we'll try again from the top...
Now read but find out where the file pointer is afterwards...
>> [d,n]=textscan(fid,fmt,1,'headerlines',18,'collectoutput',1);
>> n
n =
463865
>>
??? That's an awfully big number for one record, isn't it!!!???
C:\ML_R2012b\work> dir 023002_Q_1997.txt
Volume in drive C is unlabeled Serial number is BC9D:AAD0
Directory of c:\ml_r2012b\work\023002_q_1997.txt
023002_q_19► 463865 6/13/16 18:09
463,865 bytes in 1 file and 0 dirs 466,944 bytes allocate
113,018,556,416 bytes free
C:\ML_R2012b\work>
What we see is that this is exactly the file size; with no delimiter, the skip-to-end-of-line field went all the way to the end of the file.
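When a skip-to-end-of-line field swallows the whole file like this, it can help to look at which line terminator the file really contains. The following is only a sketch: the scratch file and its contents are made up so the snippet runs on its own, and in practice fname would point at the real data file.

```matlab
% Sketch: count carriage returns (13) and line feeds (10) in the raw bytes.
% The scratch file is a hypothetical stand-in for the real data file.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fwrite(fid, sprintf('line one\rline two\r'));  % CR-terminated sample text
fclose(fid);

fid = fopen(fname, 'r');
bytes = fread(fid, Inf, '*uint8');             % raw bytes, no text translation
fclose(fid);

nCR = sum(bytes == 13);                        % number of '\r' characters
nLF = sum(bytes == 10);                        % number of '\n' characters
fprintf('CR: %d, LF: %d\n', nCR, nLF);
delete(fname);                                 % remove the scratch file
```

If the LF count is zero while the CR count is not, a format that skips to '\n' has nothing to stop at.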
Accepted answer
More answers (2)
Oh, one thing I also noted when checking on the delimiter: the EOL marker is '\r'. I wonder what would happen if we skip to it explicitly instead of the default '\n'?
>> fmt='%f %4f/%2f/%2f %2f:%2f %f %*[^\r]';
>> [d,n]=textscan(fid,fmt,'headerlines',18,'collectoutput',1,'endofline','\r')
d =
[2x7 double]
n =
1171
>>
OK, that read the second record but croaked on the missing value... what if we create a specific format for it, too?
>> frewind(fid);
>> fmt1='%f %4f/%2f/%2f %2f:%2f %*[^\r]'; % format w/o the last float
>> [d,n]=textscan(fid,fmt1,11,'headerlines',18,'collectoutput',1,'endofline','\r')
d =
[11x6 double]
n =
4080
>>
Aha! Now we're getting somewhere; all we have to do is wrap the two calls in a loop:
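frewind(fid); % 'headerlines' assumes we start from the top of the file
d=cell2mat(textscan(fid,fmt,1,'headerlines',18,'collectoutput',1,'endofline','\r')); % 1st record only
t=cell2mat(textscan(fid,fmt1,11,'collectoutput',1,'endofline','\r')); % next group, no float value
d=[d;[t nan(size(t,1),1)]]; % pad the missing column with NaN
while ~feof(fid)
  d=[d;cell2mat(textscan(fid,fmt,1,'collectoutput',1,'endofline','\r'))];
  t=cell2mat(textscan(fid,fmt1,11,'collectoutput',1,'endofline','\r'));
  d=[d;[t nan(size(t,1),1)]]; % size(t,1) may be less than 11 at end of file
end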
While I wouldn't normally allocate dynamically like this, unless the file is extremely large this should be "fast enough". If it does bog down excessively before finishing, preallocate a large array, keep an index of the rows read, and store them appropriately.
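The preallocation idea could look roughly like this. It is a sketch only: blockSize, the fixed loop count, and the made-up newRows row are assumptions standing in for the while ~feof(fid) loop and the textscan results.

```matlab
% Sketch: preallocate in blocks, track rows used, trim once at the end.
blockSize = 1000;                 % hypothetical chunk size
nCols = 7;                        % seven columns, as in the format above
d = nan(blockSize, nCols);        % preallocated storage
nRows = 0;                        % rows filled so far

for k = 1:2500                    % stands in for "while ~feof(fid)"
    newRows = [k 1997 10 1 0 0 NaN];       % stands in for a textscan result
    m = size(newRows, 1);
    if nRows + m > size(d, 1)              % out of room: grow by a block
        d = [d; nan(max(blockSize, m), nCols)];
    end
    d(nRows+1:nRows+m, :) = newRows;       % store at the running index
    nRows = nRows + m;
end

d = d(1:nRows, :);                % drop the unused tail
```

Growing by whole blocks keeps the number of reallocations small compared with appending on every read.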
5 comments
Damith
on 15 June 2016
dpb
on 15 June 2016
The pattern breaks, unfortunately; the record
023002 1997/10/09 00:00 A
is missing the floating point value.
If you can't fix the file to be consistent, you'll likely have to read it as a character string line by line and parse each line to find out what's there, or read the whole file in as a character array and then process substrings by the fixed width of each column.
Or, add an actual delimiter such that missing fields can be identified automagically.
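The line-by-line route can be sketched as follows. This is a minimal illustration, not a tested reader for the actual file: the variable names are made up, the first sample line is reconstructed from the values textscan returned for the first record (so its exact layout is an assumption), and the second is the record quoted above. It relies on sscanf stopping at the first field it cannot convert, so a row without the value comes back one element short.

```matlab
% Sketch: parse each line on its own so a missing value only affects its row.
% Sample lines: the first is reconstructed (assumed layout), the second is
% the record quoted in the thread.
txtLines = {'023002 1997/10/01 00:00 0.2928', ...
            '023002 1997/10/09 00:00 A'};
fmt = '%f %4f/%2f/%2f %2f:%2f %f';

d = nan(numel(txtLines), 7);          % fields that are not read stay NaN
for k = 1:numel(txtLines)
    v = sscanf(txtLines{k}, fmt);     % stops at the first unconvertible field
    d(k, 1:numel(v)) = v.';           % fill only the columns actually read
end
disp(d)
```

With a real file, each element of txtLines would instead come from a loop over fgetl(fid), skipping the header lines first.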
Well you can certainly count on the government to be there to "help"... :(
You can only eliminate rows by finding out which rows they are, which is back to the line-by-line parsing.
All in all, it would likely be simpler to insert the delimiter; it's really quite simple to read a file as a 'blob' of characters.
Damith
on 15 June 2016
Shameer Parmar
on 17 June 2016
To read any text file, try this code:
fid = fopen('ascii_file.txt');
data = {};                 % one cell per line of text
tline = fgetl(fid);
while ischar(tline)        % fgetl returns -1 (numeric) at end of file
  data{end+1,1} = tline;   % empty lines are stored as ''
  tline = fgetl(fid);
end
fclose(fid);
Replace 'ascii_file.txt' with your file name.
"data" will be the output: an N-by-1 cell array that stores each line of the text file as a character string.

