How to remove corrupted data lines from text file

Paddy Mc on 11 Oct 2016
Commented: Guillaume on 11 Oct 2016
I have attached a short file showing a sample of my much larger data set. As you can see, there are multiple errors in the file, caused by a small electrical issue in our instrument. How can I get MATLAB to remove these lines? I had thought of counting the number of characters in each line and deleting the line if the count is greater or less than what I expect. I have been using textscan, but it only reads as far as the point where the errors occur.
C = textscan(fileID, '%q %{dd/MM/yyyy}D %f %f %s %s %f %f %f %7f', 'delimiter', ',');
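The length check described above could be sketched roughly as follows. Counting comma-separated fields tends to be more robust than counting characters, because valid values can differ in width. The sample lines are made up for illustration (the attached file is not reproduced here), and `expectedFields` is a hypothetical name set to the ten conversions in the textscan format.

```matlab
% Hypothetical sample: two well-formed records and two damaged ones.
% These values are invented for illustration only.
filelines = { ...
    '12:00:01.0,11/10/2016,1,2,53.5N, 9.1W,10.5,-3.2,7,1.25', ...
    '12:00:02.0,11/10/2016,1,2,53', ...
    ':::12:00:03.0,11/10/2016,1,2,53.5N, 9.1W,10.5,-3.2,7,1.25,99,98', ...
    '12:00:04.0,11/10/2016,3,4,53.6N, 9.2W,11.0,4.1,8,2.50'};

expectedFields = 10;   % assumption: one field per conversion in the format string

% Number of fields on each line = number of commas + 1
nFields = cellfun(@(s) sum(s == ',') + 1, filelines);

isvalidline = nFields == expectedFields;   % logical mask of lines to keep
goodlines = filelines(isvalidline);        % lines with the expected field count
```

A check like this only catches truncated or merged lines. A line with the right number of commas but garbled characters inside a field would still pass, so a stricter per-field validation may be needed.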

Answers (2)

Guillaume on 11 Oct 2016
Here's how I would approach your problem:
filecontent = fileread('C:\path\to\yourfile.txt'); % read the whole file at once
filelines = strsplit(filecontent, {'\r', '\n'}); % split into lines
% keep only the lines that match the expected layout of a record
isvalidline = ~cellfun(@isempty, regexp(filelines, '^\d{2}:\d{2}:\d{2}\.\d,\d{1,2}/\d{1,2}/\d{4},\d+,\d+,\d+(\.\d+)?N, \d+(\.\d+)?W,\d+(\.\d+),-?\d+(\.\d+)?,\d+,\d+(\.\d+)?$', 'once'));
% parse the surviving lines (in the date format, MM is the month; mm would mean minutes)
C = textscan(strjoin(filelines(isvalidline), '\n'), '%q %{dd/MM/yyyy}D %f %f %s %s %f %f %f %7f', 'delimiter', ',');
The regular expression is rather long and I've made some assumptions based on the sample you've provided (e.g. most values are actually positive integers). The regular expression could be made to match exactly your textscan string but it would get even longer.
Here the regular expression is only used to validate a line, but you could use it to actually extract the values (as strings that would then have to be converted to the relevant type) and avoid the textscan altogether.
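As a rough sketch of that last idea, the pattern can be given capture groups so that regexp returns the field values directly. The sample lines below are made up for illustration, and the pattern is a slightly loosened variant of the one above (the helper `num` is a hypothetical name, and it allows an optional sign on every decimal field), so it would need adjusting to the real data.

```matlab
% Hypothetical sample lines (invented values): two valid records, one corrupted.
filelines = { ...
    '12:00:01.0,11/10/2016,1,2,53.5N, 9.1W,10.5,-3.2,7,1.25', ...
    '12:00:0#.0,11/1', ...
    '12:00:04.0,11/10/2016,3,4,53.6N, 9.2W,11.0,4.1,8,2.50'};

% Optionally signed number; (?:...) groups without capturing
num = '-?\d+(?:\.\d+)?';
pattern = ['^(\d{2}:\d{2}:\d{2}\.\d),(\d{1,2}/\d{1,2}/\d{4}),(\d+),(\d+),', ...
           '(', num, ')N, (', num, ')W,(', num, '),(', num, '),(\d+),(', num, ')$'];

% One cell per line: a 1-by-10 cell of char vectors, or empty if the line does not match
tokens = regexp(filelines, pattern, 'tokens', 'once');
tokens = tokens(~cellfun(@isempty, tokens));   % drop the corrupted lines
fields = vertcat(tokens{:});                   % N-by-10 cell array of char vectors

% Convert the captured text to the relevant types
dates  = datetime(fields(:, 2), 'InputFormat', 'dd/MM/yyyy');
lat    = str2double(fields(:, 5));             % degrees north
lon    = str2double(fields(:, 6));             % degrees west
values = str2double(fields(:, 7:10));          % last four numeric columns
```

Validation and extraction then happen in a single pass, at the cost of converting every column by hand.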

George on 11 Oct 2016
Add a 'CommentStyle' name-value pair, e.g.:
C = textscan(fileID, '%q %{dd/MM/yyyy}D %f %f %s %s %f %f %f %7f',...
'delimiter', ',', 'CommentStyle', ':');
This will ignore any lines starting with a colon.
1 Comment
Guillaume on 11 Oct 2016
That would only work if the error is always manifested by a colon starting the line. Looking at the file, I'm not sure that's the case here.
