Find line containing word in a mixed format txt file
I have a FORTRAN code that generates a mixed-format output text file of varying length (depending on the run parameters). I want to write a MATLAB script that finds a table located somewhere in that text file. The table always has the same number of columns and the same column variables.
The data might look something like this (representative of the formatting):
...
xourbwofigefifhtiryefsldtldcrpewicbqocyttbwloiopauyveapmwvkylxepftamjocccgpnybtubzhnqqnacyihnyhdxcvshiyusemrerqceebxylcqtpgksivchcrsxbcnggypsysnjdwtkbdzptffyvsesvqwpsnzhkphrftuykjogzpyqhqmuhenqulupujukqsyfqrxmtzxomaojjerczyrqmqhyoihhcoixxkyouqxumkefltqsuraaapoxourbwofigefifhtiryefsldtldcrpewicbqocyttbwloiopauyveapmwvkylxepftamjocccgpnybtubzhnqqnacyihnyhdxcvshiyusemrerqceebxylcqtpgksivchcrsxbcnggypsysnjdwtkbdzptffyvsesvqwpsnzhkphrftuykjogzpyqhqmuhenqulupujukqsyfqrxmtzxomaojjerczyrqmqhyoihhcoixxkyouqxumkefltqsuraaapo
$$$$$$$$$$$$$$$$$$$$$$$$$$$ output $$$$$$$$$$$$$$$$$$$$$$$$$$$
A B C D E F G H I J K
------ --------- ------ ------------------------------------------------------------------------------ ---------------------------------- --------------
s m kg A K mol
2000.0 7.744E-02 5.000 3.098E+01 3.098E+01 0.000E+00 0.000E+00 3.098E+01 3.87194E-01 3.87194E-01 1.000E+00
2005.0 4.646E-02 4.988 1.868E+01 1.868E+01 0.000E+00 0.000E+00 1.868E+01 6.19481E-01 6.19481E-01 1.000E+00
2010.0 6.827E-02 4.975 2.758E+01 2.758E+01 0.000E+00 0.000E+00 2.758E+01 9.60817E-01 9.60817E-01 1.000E+00
2015.0 1.038E-01 4.963 4.213E+01 4.213E+01 0.000E+00 0.000E+00 4.213E+01 1.47961E+00 1.47961E+00 1.000E+00
2020.0 9.099E-02 4.950 3.713E+01 3.713E+01 0.000E+00 0.000E+00 3.713E+01 1.93456E+00 1.93456E+00 1.000E+00
2025.0 9.283E-02 4.938 3.806E+01 3.806E+01 0.000E+00 0.000E+00 3.806E+01 2.39869E+00 2.39869E+00 1.000E+00
2030.0 5.814E-02 4.926 2.396E+01 2.396E+01 0.000E+00 0.000E+00 2.396E+01 2.68937E+00 2.68937E+00 1.000E+00
(table might continue an arbitrary number of rows down)
....
xourbwofigefifhtiryefsldtldcrpewicbqocyttbwloiopauyveapmwvkylxepftamjocccgpnybtubzhnqqnacyihnyhdxcvshiyusemrerqceebxylcqtpgksivchcrsxbcnggypsysnjdwtkbdzptffyvsesvqwpsnzhkphrftuykjogzpyqhqmuhenqulupujukqsyfqrxmtzxomaojjerczyrqmqhyoihhcoixxkyouqxumkefltqsuraaapoxourbwofigefifhtiryefsldtldcrpewicbqocyttbwloiopauyveapmwvkylxepftamjocccgpnybtubzhnqqnacyihnyhdxcvshiyusemrerqceebxylcqtpgksivchcrsxbcnggypsysnjdwtkbdzptffyvsesvqwpsnzhkphrftuykjogzpyqhqmuhenqulupujukqsyfqrxmtzxomaojjerczyrqmqhyoihhcoixxkyouqxumkefltqsuraaapo
....
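One way to approach the question as stated is to read the file line by line, find the first line containing the marker word, and parse the rows that follow. The sketch below is only an illustration under assumptions: it writes a small stand-in file (hypothetical content with three columns instead of eleven), assumes the table layout shown above (marker, column names, dashed line, units, then data), and relies on readlines, which needs R2020b or later. If the word "output" can also appear in the free-form text, matching a longer piece of the marker line would be safer.

```matlab
% Stand-in file with hypothetical content so the sketch runs on its own
filename = [tempname '.txt'];
fid = fopen(filename, 'w');
fprintf(fid, 'some unrelated text\n');
fprintf(fid, '$$$ output $$$\n');
fprintf(fid, 'A B C\n');
fprintf(fid, '--- --- ---\n');
fprintf(fid, 's m kg\n');
fprintf(fid, '2000.0 7.744E-02 5.000\n');
fprintf(fid, '2005.0 4.646E-02 4.988\n');
fprintf(fid, 'trailing text\n');
fclose(fid);

txt = readlines(filename);                       % one string per line
markerLine = find(contains(txt, "output"), 1);   % first line containing the word
header = split(strtrim(txt(markerLine + 1)))';   % column names follow the marker
nCols = numel(header);

% Data starts after the marker, names, dashed and units lines (assumed layout)
values = zeros(0, nCols);
for k = markerLine + 4 : numel(txt)
    row = sscanf(txt(k), '%f')';                 % empty if the line is not numeric
    if numel(row) ~= nCols
        break                                    % first non-table line ends the table
    end
    values = [values; row];                      %#ok<AGROW>
end
delete(filename);
```

The loop stops at the first line that does not contain exactly nCols numbers, so the table may have any number of rows.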
2 comments
Dyuman Joshi
11 Dec 2023
Please give an example of what the data looks like, and what the table to be found looks like.
Even better if you can attach a sample data file.
Matthew
11 Dec 2023
Accepted answer
More answers (2)
Mathieu NOE
11 Dec 2023
Hello,
you could do this:
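% Read the file; with this sample, each line ends up in one cell (N-by-1 cell array)
out = readcell('Doc1.txt');
eof = size(out,1);
% Row index of the first line containing the marker word
ind1 = find(contains(out,'output'),1);
% Keep the lines after the marker; note that this also drops the last line of the file
out = out(ind1+1:eof-1,:)
header = (split(out(1,:)))'   % column names
units = (split(out(3,:)))'    % units row (row 2 is the dashed separator)
% str2num evaluates its input, so use it only on trusted text
values = str2num(char(out(4:end,:)))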
7 comments
Mathieu NOE
11 Dec 2023
This is the result in my command window, based on the attached text file (copied and pasted from your post):
out =
10×1 cell array
{'A B C D E F G H I J K' }
{'------ --------- ------ ------------------------------------------------------------------------------ ---------------------------------- --------------' }
{'s m kg A K mol'}
{'2000.0 7.744E-02 5.000 3.098E+01 3.098E+01 0.000E+00 0.000E+00 3.098E+01 3.87194E-01 3.87194E-01 1.000E+00' }
{'2005.0 4.646E-02 4.988 1.868E+01 1.868E+01 0.000E+00 0.000E+00 1.868E+01 6.19481E-01 6.19481E-01 1.000E+00' }
{'2010.0 6.827E-02 4.975 2.758E+01 2.758E+01 0.000E+00 0.000E+00 2.758E+01 9.60817E-01 9.60817E-01 1.000E+00' }
{'2015.0 1.038E-01 4.963 4.213E+01 4.213E+01 0.000E+00 0.000E+00 4.213E+01 1.47961E+00 1.47961E+00 1.000E+00' }
{'2020.0 9.099E-02 4.950 3.713E+01 3.713E+01 0.000E+00 0.000E+00 3.713E+01 1.93456E+00 1.93456E+00 1.000E+00' }
{'2025.0 9.283E-02 4.938 3.806E+01 3.806E+01 0.000E+00 0.000E+00 3.806E+01 2.39869E+00 2.39869E+00 1.000E+00' }
{'2030.0 5.814E-02 4.926 2.396E+01 2.396E+01 0.000E+00 0.000E+00 2.396E+01 2.68937E+00 2.68937E+00 1.000E+00' }
header =
1×11 cell array
Columns 1 through 10
{'A'} {'B'} {'C'} {'D'} {'E'} {'F'} {'G'} {'H'} {'I'} {'J'}
Column 11
{'K'}
units =
1×6 cell array
{'s'} {'m'} {'kg'} {'A'} {'K'} {'mol'}
values =
1.0e+03 *
Columns 1 through 9
2.0000 0.0001 0.0050 0.0310 0.0310 0 0 0.0310 0.0004
2.0050 0.0000 0.0050 0.0187 0.0187 0 0 0.0187 0.0006
2.0100 0.0001 0.0050 0.0276 0.0276 0 0 0.0276 0.0010
2.0150 0.0001 0.0050 0.0421 0.0421 0 0 0.0421 0.0015
2.0200 0.0001 0.0050 0.0371 0.0371 0 0 0.0371 0.0019
2.0250 0.0001 0.0049 0.0381 0.0381 0 0 0.0381 0.0024
2.0300 0.0001 0.0049 0.0240 0.0240 0 0 0.0240 0.0027
Columns 10 through 11
0.0004 0.0010
0.0006 0.0010
0.0010 0.0010
0.0015 0.0010
0.0019 0.0010
0.0024 0.0010
0.0027 0.0010
Mathieu NOE
11 Dec 2023
Someone very clever may find a readtable-based solution, but you may have to play with a lot of parameters to get it to work. I preferred to load everything into a cell array and then do a bit of uncomplicated post-processing, but that's a personal opinion.
Matthew
11 Dec 2023
Mathieu NOE
12 Dec 2023
Hello Matthew,
Maybe I should work with your data file rather than something I generated myself.
Can you share one file?
Matthew
12 Dec 2023
Mathieu NOE
13 Dec 2023
Hello again,
I have a one-line solution for you (with the help of the attached function, of course):
[outdata,head] = readclm('file.txt',11,219);
outdata 268x11 double
head = 4×119 char array
' m rip lam ripl riapl riahl rial ril riapi rii taua '
' ------ --------- ------ ----------------------------------------------------- ------------------------ ---------'
' m m/J/m A m/J/A m/J t '
' '
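The source of readclm is not shown in this thread, so for readers without the attachment, here is a rough sketch of a similar read using the built-in textscan. It assumes (this is only a guess from the call above) that the second and third arguments of readclm are the number of columns and the number of header lines to skip. The stand-in file, nCols and nHeader below are hypothetical.

```matlab
% Stand-in file with hypothetical content so the sketch runs on its own
filename = [tempname '.txt'];
fid = fopen(filename, 'w');
fprintf(fid, 'free-form text\n$$$ output $$$\nA B C\n--- --- ---\ns m kg\n');
fprintf(fid, '2000.0 7.744E-02 5.000\n2005.0 4.646E-02 4.988\n');
fprintf(fid, 'trailing text\n');
fclose(fid);

nCols = 3;     % number of numeric columns in the table (assumed known)
nHeader = 5;   % lines to skip before the first data row (assumed known)

fid = fopen(filename, 'r');
% textscan keeps reading rows of nCols numbers and stops at the first
% field it cannot convert, i.e. at the first non-numeric line after the table
C = textscan(fid, repmat('%f', 1, nCols), ...
    'HeaderLines', nHeader, 'CollectOutput', true);
fclose(fid);
outdata = C{1};   % numeric matrix, one row per table row
delete(filename);
```

Unlike the marker search, this needs the number of header lines up front, so it only fits when the table always starts on the same line or the line number has been found separately.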
Mathieu NOE
13 Dec 2023
If you want to access the individual labels and units from head, you can do this:
header = (split(strtrim(head(1,:))))'
units = (split(strtrim(head(3,:))))'
header = 1×11 cell array
{'m'} {'rip'} {'lam'} {'ripl'} {'riapl'} {'riahl'} {'rial'} {'ril'} {'riapi'} {'rii'} {'taua'}
units = 1×6 cell array
{'m'} {'m/J/m'} {'A'} {'m/J/A'} {'m/J'} {'t'}
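Once the labels and the numeric matrix are available, they can be combined into a table so the columns can be addressed by name. This is a minimal sketch with hypothetical three-column data; the names header and values simply mirror the variables used earlier in this thread. Note that the units row in the actual file has fewer entries than there are columns, so units can only be attached this way when there is exactly one unit per column, as assumed here.

```matlab
% Hypothetical labels and data standing in for the parsed header and values
header = {'A', 'B', 'C'};
values = [2000.0 7.744e-02 5.000;
          2005.0 4.646e-02 4.988];

% One table variable per column, named after the header labels
T = array2table(values, 'VariableNames', header);

% Optional: attach units (requires exactly one unit per column)
T.Properties.VariableUnits = {'s', 'm', 'kg'};

colB = T.B;   % access a column by its label
```

The labels must be valid, unique variable names for this to work without further options.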
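filename = 'file.txt';
% Read the whole file as one row of characters
fid = fopen(filename,'r');
data = fread(fid,[1 Inf],'*char');
fclose(fid);
% Position of the first occurrence of the marker text
idx = strfind(data,'$ output $');
idx = idx(1);
% Line number of the marker line = number of newlines before it, plus one
n = 1 + nnz(data(1:idx) == newline());
% Skip everything up to the marker line, plus the column-name and dashed lines
T = readtable(filename,'NumHeaderLines',n+2);
% Drop rows containing NaN, i.e. rows read from non-numeric lines
T(any(isnan(T{:,:}),2),:) = [];
head(T)
tail(T)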
1 comment
Matthew
14 Dec 2023


