Can I keep the text of my delimiter or LineEnding when using readcell, readtable, etc.?
Adam Morrone
on 30 Jun 2020
Commented: Adam Morrone
on 1 Jul 2020
I am using readcell to read in data from a log file. Because of the way it's logged, I need to use '2020-' or '2019-' as my LineEnding, as follows:
data = readcell(filename, 'Delimiter','\t', 'LineEnding', {'2020-', '2019-'});
When I execute this line, it deletes the '2020-' text, but I need to keep the year in the data. Is there any way of doing this?
4 Comments
dpb
on 30 Jun 2020
Edited: dpb
on 1 Jul 2020
You need to attach a section of a real file if people are to have any chance to experiment...
Unless there's a hidden/nonprinting character in the file preceding the date string, it looks like the easiest approach would be to read the whole file as a char() vector, use a regular expression to locate the dates, and then break on those marks. You could insert \n there to make it easy...
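As a rough illustration of "locate the dates and break on those marks", here is a minimal sketch that slices the text directly at the date positions. The sample string is made up for the example (the real one would come from something like fileread), and it assumes the text begins with a date; anything before the first date would be dropped.

```matlab
% Hypothetical sample standing in for the file contents (records run together).
s = sprintf('2019-12-12 15:32:21\ttab\tdata2020-01-06 05:39:54\tmore\ttabs');

pat = '\d{4}-\d{2}-\d{2}';          % yyyy-mm-dd as digits
ix = regexp(s, pat);                % start index of each date string
stops = [ix(2:end)-1, numel(s)];    % each record ends just before the next date

% Cut one record per date; the date text stays at the front of each record.
recs = arrayfun(@(a, b) s(a:b), ix, stops, 'UniformOutput', false).';
```

Each element of recs is then one log record, still tab-delimited, with its year intact.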
Walter Roberson
on 1 Jul 2020
readcell() converts each element independently if possible, leaving it as character vector if nothing else works.
I posted logic for doing that as part of my csv2table function in https://www.mathworks.com/matlabcentral/answers/285186-importing-data-without-knowing-number-of-columns#comment_368710
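To illustrate the idea of converting each element independently and leaving it as text when nothing else works, here is a minimal sketch using str2double. It only mimics the behaviour for plain numbers; it is not readcell's actual implementation and not the code from the linked csv2table post, and the sample values are made up.

```matlab
% Hypothetical raw fields, all read as text.
raw = {'2020-01-06 05:39:54', '3.14', 'abc', ''};

vals = raw;                            % start with everything as char vectors
nums = str2double(raw);                % NaN wherever the text is not a number
isNum = ~isnan(nums);
vals(isNum) = num2cell(nums(isNum));   % replace only the fields that converted
```

Note that in this sketch a field containing the literal text NaN stays as text, since it cannot be told apart from a failed conversion.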
Accepted Answer
Walter Roberson
on 1 Jul 2020
That is not possible using readtable(), readmatrix(), or readcell().
You can pre-process the file:
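S = fileread(filename);                       % read the whole file as one char vector
S = regexprep(S, '(2019-|2020-)', '\n$1');    % newline in front of each year marker; the marker is kept
tfile = tempname();                           % temporary file for the reformatted text
[fid, msg] = fopen(tfile, 'w');
if fid < 0
    error('Failed to open temporary file "%s" because "%s"', tfile, msg);
end
fwrite(fid, S);
fclose(fid);
data = readcell(tfile, 'delimiter', '\t');    % default line endings now separate the records
delete(tfile);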
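To see what the regexprep step does, here is a minimal sketch on a made-up sample string (the real S would come from fileread). It assumes the text starts with a year marker, in which case the replacement also puts a newline before the very first record; the second regexprep removes it.

```matlab
% Hypothetical sample standing in for the log contents (tabs via sprintf).
S = sprintf('2019-12-12 15:32:21\ta\tb2020-01-06 05:39:54\tc\td');

S = regexprep(S, '(2019-|2020-)', '\n$1');   % newline goes in front; the year text is kept
S = regexprep(S, '^\n', '');                 % drop the newline added before the first record

recs = splitlines(S);                        % one record per cell, year included
```

Because the replacement re-emits the captured token with $1, nothing is deleted from the data; only newlines are added.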
More Answers (1)
dpb
on 1 Jul 2020
Edited: dpb
on 1 Jul 2020
OK, try the following for starters. There is probably a cleaner way to do the splits given the indices and direct index manipulation, but since these are not fixed-length records, this is an easy way to get the cellstr() array...
% s is the whole file read as a char vector
pat="\d{4}-\d{2}-\d{2}";  % regular expression for yyyy-mm-dd as digits (named pat to avoid shadowing exp())
ix=regexp(s,pat);         % find the beginning of each date string
ix=flip(ix(2:end));       % all except first; work from the end of the string backwards so earlier indices stay valid
for i=1:numel(ix)
  s=insertBefore(s,ix(i),newline);  % insert newline in front of each date
end
% show what we get...
s=splitlines(s)
s =
4×1 cell array
{'2019-12-12 15:32:21→tab→seperated→data→is→here' }
{'2020-01-06 05:39:54→more→tab→seperations→blank→blank'}
{'2020-01-06 05:45:12→then→some→more→tabs→blank' }
{'2020-01-08 11:25:19→even→more→lovely→tabs→blank' }
>>
If the file is tab-delimited with a fixed number of fields, it should then be pretty straightforward to parse into a table, timetable, or whatever form is wanted. If the logger didn't have that much discipline, you may have another step or three to deal with.
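For the fixed-number-of-fields case, a minimal sketch of that last step might look like the following. The two sample records, the variable names Field1 and Field2, and the timestamp format are assumptions made for the example; the vertcat call errors if the records do not all have the same number of fields.

```matlab
% Hypothetical records, one per cell, as produced by splitlines.
recs = {sprintf('2019-12-12 15:32:21\ttab\tseparated'); ...
        sprintf('2020-01-06 05:39:54\tmore\ttabs')};

% Split each record on tabs; keep empty fields rather than collapsing them.
parts = cellfun(@(r) strsplit(r, sprintf('\t'), 'CollapseDelimiters', false), ...
                recs, 'UniformOutput', false);
C = vertcat(parts{:});     % N-by-3 cell array of char vectors

% Convert the first column to datetime and put the rest in a table.
t = datetime(C(:,1), 'InputFormat', 'yyyy-MM-dd HH:mm:ss');
T = cell2table(C(:,2:end), 'VariableNames', {'Field1', 'Field2'});
T.Time = t;
```

Setting 'CollapseDelimiters' to false matters here because strsplit otherwise merges consecutive tabs, which would shift the columns whenever a field is blank.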