Deriving specific rows from a large text file

lostatsea
lostatsea on 28 Oct 2018
Commented: dpb on 4 Nov 2018
Hello! I am very new at MATLAB and am still learning. I have a 66 million row dataset in the form of a text file. I need to pull certain rows from the list to create a new list. Specifically, the data presents as:
Time,Pressure,Sea pressure,Depth
2018-08-25 16:10:26.000,10.2011833,0.0686833,0.0681232
The data isn't in separate columns, it's comma separated (except date and time, which are separated by a space). The new list needs to contain anything with a time stamp that ends in milliseconds of:
.000
.125
.375
.500
.625
.750
.875
I am not sure what the best way would be to address this. Any help would be greatly appreciated!
  14 comments
lostatsea
lostatsea on 28 Oct 2018
Thanks Jonas, I have tried adjusting the code to this:
>> TT1 = readtable('sample1.txt');
TT2 = etime(TT1,'regular', 'TimeStep', seconds(1/8));
Error using etime
Too many input arguments.
dpb
dpb on 28 Oct 2018
The function Walter used is retime not etime.
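For reference, a minimal sketch of what a retime call can look like. It assumes R2016b or later, since timetables are not available in R2015a. The variable names and values are made up for illustration: the timestamps are synthetic 16 Hz samples, not the poster's data. Given an explicit vector of new times and the default method, rows whose times match exactly are copied over.

```matlab
% Synthetic 16 Hz samples (illustrative values, not the original data)
t0 = datetime(2018,8,25,16,10,26);
Time = t0 + seconds((0:8)'/16);
Pressure = (1:9)';
TT1 = timetable(Time, Pressure);

% Target times on a 1/8 s grid covering the same span
newTimes = t0 + seconds((0:4)'/8);

% Default method is 'fillwithmissing': rows with matching times are copied
TT2 = retime(TT1, newTimes);
```

On real data read from text, the timestamps would first have to be converted to datetime values before building the timetable.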


Accepted Answer

Akira Agata
Akira Agata on 29 Oct 2018
I would recommend updating your MATLAB to the latest version, since R2015a does not support many functions that are useful for your task, such as retime. If you have to do this in an old MATLAB release for some reason, the following is one possible solution.
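% Read the file, skipping the header line; the time column is kept as text
T = readtable('sample1.txt',...
'ReadVariableNames', false,...
'HeaderLines', 1);
T.Properties.VariableNames = {'Time','Pressure','SeaPressure','Depth'};
% Match time stamps whose last three digits are a multiple of 125 ms
c = regexp(T.Time,'(000|125|250|375|500|625|750|875)$','match');
idx = cellfun(@isempty,c); % true where there is no match
T = T(~idx,:); % keep only the matching rows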
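To make the effect of the pattern concrete, here is a small sketch on a hand-written cell array of time stamps. The sample strings and the variable names stamps and keep are illustrative only. This variant also anchors on the literal decimal point and uses the 'once' option, a small change from the code above.

```matlab
% Illustrative time stamps (not taken from the original file)
stamps = {'2018-08-25 16:10:26.000'; ...
          '2018-08-25 16:10:26.063'; ...
          '2018-08-25 16:10:26.125'; ...
          '2018-08-25 16:10:26.188'};

% With 'once', each cell holds the start index of the match, or [] if none
hits = regexp(stamps, '\.(000|125|250|375|500|625|750|875)$', 'once');

% Logical mask of the rows to keep
keep = ~cellfun(@isempty, hits);
```

The mask keep can then index the rows of a table, exactly as ~idx does above.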
  1 comment
lostatsea
lostatsea on 1 Nov 2018
Thank you Akira! This worked perfectly. I am taking your suggestion and will be updating our MATLAB. Thank you again for your help!


More Answers (1)

dpb
dpb on 28 Oct 2018
Read as table or timetable; deal with the asterisks as needs must depending on whether they're real or not...
There are two alternatives: retime to a specific vector, which means making a new time vector, or find the locations matching the desired times.
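t=readtable('yourfile');
% assumes Time was read as text; convert it and take the seconds (with fraction)
s=second(datetime(t.Time,'InputFormat','yyyy-MM-dd HH:mm:ss.SSS'));
ix=ismember(round((s-fix(s))*1000),(0:7)*125); % locate millisec multiples of 125
t=t(ix,:); % save those into the desired table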
  15 comments
Walter Roberson
Walter Roberson on 3 Nov 2018
t = readtable('sample1.txt'); %up to R2018b still cannot autodetect times with fractions of a second, even if detectImportOptions is used
tt = datetime(t.Time, 'inputformat', 'yyyy-MM-dd HH:mm:ss.SSS');
s = mod(second(tt),1) * 1000;
mask = ismember(s, 0:125:875);
selected = t(mask,:);
However, with that particular data, this can be shortened to
t = readtable('sample1.txt');
selected = t(1:2:end, :);
dpb
dpb on 4 Nov 2018
True that, if it is the complete dataset...
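The every-other-row shortcut only holds if no samples are missing and the first row falls on a 1/8 s boundary. A minimal sketch of a sanity check follows, using synthetic time stamps; the values and the names ms, shortcut and sameRows are illustrative assumptions, not from the original data.

```matlab
% Synthetic time stamps at roughly 16 Hz, rounded to whole milliseconds
tt = datetime(2018,8,25,16,10,26) + milliseconds([0 63 125 188 250 313]');

ms = round(mod(second(tt),1) * 1000);   % millisecond part of each stamp
mask = ismember(ms, 0:125:875);         % rows on the 1/8 s grid

shortcut = false(size(mask));
shortcut(1:2:end) = true;               % every other row, starting at row 1

sameRows = isequal(mask, shortcut);     % true only if both select the same rows
```

If sameRows comes back false on the real file, the mask-based selection is the safer choice.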

