How to skip lines that start with a certain character while reading a text file

I have a text file with two columns and a certain number of rows. The columns are split into blocks by a text line that starts with #. How can I load only the data and skip the # lines?
  4 comments
christian_00 on 18 Jun 2024
I'm sorry, first time here, I put it in the question page below the "release" option but maybe others can't see it
dpb on 18 Jun 2024
Hmmmm....I don't see the release information on the Q?; I do use the compact format, but I'd think it still should show it if user specified it. I'll have to open another window and see if the alternate....oh! I see; it's over there in the RH column with all that other stuff I never pay attention to, not part of the Q? itself. I'll have to try to remember to go look, but nobody else caught it, either, including the MATHWORKS employee....so it clearly isn't in the most suitable location.
Anyway, did you see my followup Answer given the release? readtable should solve your problem as a one-liner.


Accepted Answer

dpb on 18 Jun 2024
Given the new information that you are on R2018, which predates the functions used in the answers initially given, the easiest high-level tool is readtable; it has been available since R2013b.
tData=readtable('yourfile.txt','CommentStyle','#');
Alternatively, as mentioned in earlier sidebar conversation, reverting to the venerable textread would probably be my second choice even though it is now deprecated.
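For a lower-level route that should also run on R2018a, here is a minimal sketch using textscan rather than textread (a swapped-in but closely related function), which accepts the same kind of 'CommentStyle' option. The file name fname and the sample contents are hypothetical, written to a temporary file only so the snippet is self-contained; it assumes exactly two numeric columns.

```matlab
% Hypothetical sample file: two numeric columns split by a '#' line.
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '1 2\n3 4\n# second block\n5 6\n7 8\n');
fclose(fid);

% textscan ignores everything from '#' to the end of that line.
fid = fopen(fname, 'rt');
C = textscan(fid, '%f %f', 'CommentStyle', '#');
fclose(fid);

% C is a 1-by-2 cell array, one cell per column; combine into a matrix.
data = [C{1} C{2}];

delete(fname);   % remove the temporary sample file
```

textscan returns one cell per format specifier, so the concatenation on the last step is what turns the result into an ordinary N-by-2 numeric array.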
  2 comments
christian_00 on 18 Jun 2024
It worked, thank you very much
dpb on 18 Jun 2024
If this solved your problem, please "ACCEPT" the answer to let folks know if nothing else...


More Answers (4)

dpb on 18 Jun 2024
@Taylor's solution will work, but leaves you with the need to convert the string data to numeric values to use it. For a direct solution, try
data=readmatrix('yourfile.txt','CommentStyle','#');
See the readmatrix documentation for details. readtable supports the same option if a table is desired instead of an array, which is particularly useful if the file has variable names in its first record.

Taylor on 18 Jun 2024
I would just load the data as a string and use the erase function to remove the "#"
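A rough sketch of the string-based route, for illustration only. Note that erase on its own strips just the "#" character and leaves the rest of that line in place, so this sketch swaps in startsWith to drop the whole line and then converts the remaining text to numbers. The file name fname and the sample contents are hypothetical, and it assumes two columns separated by a single space.

```matlab
% Hypothetical sample file: two numeric columns split by a '#' line.
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '1 2\n3 4\n# second block\n5 6\n7 8\n');
fclose(fid);

% Load the whole file as text and break it into one string per line.
txt = fileread(fname);
lines = strtrim(splitlines(string(txt)));

% Drop empty lines and lines that start with '#'.
lines(lines == "" | startsWith(lines, "#")) = [];

% Split each line at the space and convert the pieces to doubles.
data = str2double(split(lines));

delete(fname);   % remove the temporary sample file
```

The final conversion step is the extra work that a formatted reader such as readtable does for you.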
  6 comments
Taylor on 18 Jun 2024
@dpb Update from development on the readlines function: "The other functions mentioned are "formatted" text function that expect some structure of the data. readlines is meant simply to read the lines in the file. Its interface is kept minimal on purpose."
dpb on 18 Jun 2024
Edited: dpb on 20 Jun 2024
That makes no sense at all to me..."make things as simple as possible, but not too simple".
I suggest the choice should be the user's, rather than the developer deciding users shouldn't need to do that, and that the request for the additional option be retained.
While I'll agree not all the options available with the other members of the family are applicable to the purpose of readlines, I will argue to the end that skipping whole lines based on comment style is a line-reading functionality (as the subject question illustrates) and should be available.



Image Analyst on 18 Jun 2024
Try readlines to get each line as an element of a string array. Then loop over all lines, skipping the ones that start with #:
fprintf('Beginning to run %s.m at %s...\n', mfilename, datetime('now','TimeZone','local','Format','HH:mm:ss'));
allLines = readlines('Data3.txt'); % Read whole file into a string array, one element per line.
for k = 1 : numel(allLines)
    thisLine = strtrim(allLines(k)); % Strip leading and trailing white space, in case there is any.
    if startsWith(thisLine, '#')
        % Skip lines starting with #
        fprintf('Skipping %s\n', thisLine);
    else
        % Process lines NOT starting with #
        fprintf('Processing %s\n', thisLine);
    end
end
fprintf('Done running %s.m at %s...\n', mfilename, datetime('now','TimeZone','local','Format','HH:mm:ss'));
  2 comments
christian_00 on 18 Jun 2024
I think this is not present in the 2018 release :(
dpb on 21 Jun 2024
Given the lack of the obvious feature to omit comment lines in readlines, the above could be somewhat abbreviated:
fprintf('Beginning to run %s.m at %s...\n', mfilename, datetime('now','TimeZone','local','Format','HH:mm:ss'));
allLines = strtrim(readlines('Data3.txt')); % read file, trim lines
allLines(startsWith(allLines,'#')) = []; % remove comment lines
for k = 1 : numel(allLines) % iterate over the remainder
    % Process lines NOT starting with #
    fprintf('Processing %s\n', allLines(k));
end
fprintf('Done running %s.m at %s...\n', mfilename, datetime('now','TimeZone','local','Format','HH:mm:ss'));



Image Analyst on 18 Jun 2024
Try this:
% Open the file for reading in text mode.
fileID = fopen(fullFileName, 'rt');
% Read the first line of the file.
textLine = fgetl(fileID);
lineCounter = 0;
while ischar(textLine)
    lineCounter = lineCounter + 1;
    textLine = strtrim(textLine); % Strip off white space.
    %fprintf('Read %s\n', textLine);
    if startsWith(textLine, '#')
        % Skip lines starting with #
        fprintf('Skipping %s\n', textLine);
    else
        % Process lines NOT starting with #
        fprintf('Processing %s\n', textLine);
    end
    % Read the next line. fgetl returns -1 at end of file, which ends the loop.
    textLine = fgetl(fileID);
end
% All done reading all lines, so close the file.
fclose(fileID);
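To turn the "Processing" branch into actual numbers, one possibility is to parse each kept line with sscanf and append it to a matrix. This is only a sketch under assumptions: every data line holds exactly two numeric values, and the file name fname and its contents are hypothetical samples written to a temporary file so the snippet runs on its own.

```matlab
% Hypothetical sample file: two numeric columns split by a '#' line.
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '1 2\n3 4\n# second block\n5 6\n');
fclose(fid);

data = zeros(0, 2);                 % will grow by one row per data line
fileID = fopen(fname, 'rt');
textLine = fgetl(fileID);
while ischar(textLine)              % fgetl returns -1 at end of file
    textLine = strtrim(textLine);
    if ~isempty(textLine) && ~startsWith(textLine, '#')
        values = sscanf(textLine, '%f %f');   % 2-by-1 column vector
        data(end+1, :) = values.';            % append as a row
    end
    textLine = fgetl(fileID);
end
fclose(fileID);

delete(fname);   % remove the temporary sample file
```

Growing data inside the loop is fine for small files; for large ones, a formatted reader with a comment-style option is likely the better fit.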

Release

R2018a
