Find line containing word in a mixed format txt file

Question

0 votes

file.txt

I have a FORTRAN code that generates a mixed-format output text file of varying length (depending on the run parameters). I want to write a MATLAB script that finds a table located somewhere in that text file; the table always has the same number of columns and the same column variables.

The data might look something like this (representative of the formatting):

...

xourbwofigefifhtiryefsldtldcrpewicbqocyttbwloiopauyveapmwvkylxepftamjocccgpnybtubzhnqqnacyihnyhdxcvshiyusemrerqceebxylcqtpgksivchcrsxbcnggypsysnjdwtkbdzptffyvsesvqwpsnzhkphrftuykjogzpyqhqmuhenqulupujukqsyfqrxmtzxomaojjerczyrqmqhyoihhcoixxkyouqxumkefltqsuraaapoxourbwofigefifhtiryefsldtldcrpewicbqocyttbwloiopauyveapmwvkylxepftamjocccgpnybtubzhnqqnacyihnyhdxcvshiyusemrerqceebxylcqtpgksivchcrsxbcnggypsysnjdwtkbdzptffyvsesvqwpsnzhkphrftuykjogzpyqhqmuhenqulupujukqsyfqrxmtzxomaojjerczyrqmqhyoihhcoixxkyouqxumkefltqsuraaapo

$$$$$$$$$$$$$$$$$$$$$$$$$$$ output $$$$$$$$$$$$$$$$$$$$$$$$$$$

A B C D E F G H I J K

------ --------- ------ ------------------------------------------------------------------------------ ---------------------------------- --------------

s m kg A K mol

2000.0 7.744E-02 5.000 3.098E+01 3.098E+01 0.000E+00 0.000E+00 3.098E+01 3.87194E-01 3.87194E-01 1.000E+00

2005.0 4.646E-02 4.988 1.868E+01 1.868E+01 0.000E+00 0.000E+00 1.868E+01 6.19481E-01 6.19481E-01 1.000E+00

2010.0 6.827E-02 4.975 2.758E+01 2.758E+01 0.000E+00 0.000E+00 2.758E+01 9.60817E-01 9.60817E-01 1.000E+00

2015.0 1.038E-01 4.963 4.213E+01 4.213E+01 0.000E+00 0.000E+00 4.213E+01 1.47961E+00 1.47961E+00 1.000E+00

2020.0 9.099E-02 4.950 3.713E+01 3.713E+01 0.000E+00 0.000E+00 3.713E+01 1.93456E+00 1.93456E+00 1.000E+00

2025.0 9.283E-02 4.938 3.806E+01 3.806E+01 0.000E+00 0.000E+00 3.806E+01 2.39869E+00 2.39869E+00 1.000E+00

2030.0 5.814E-02 4.926 2.396E+01 2.396E+01 0.000E+00 0.000E+00 2.396E+01 2.68937E+00 2.68937E+00 1.000E+00

(table might continue an arbitrary number of rows down)

....

xourbwofigefifhtiryefsldtldcrpewicbqocyttbwloiopauyveapmwvkylxepftamjocccgpnybtubzhnqqnacyihnyhdxcvshiyusemrerqceebxylcqtpgksivchcrsxbcnggypsysnjdwtkbdzptffyvsesvqwpsnzhkphrftuykjogzpyqhqmuhenqulupujukqsyfqrxmtzxomaojjerczyrqmqhyoihhcoixxkyouqxumkefltqsuraaapoxourbwofigefifhtiryefsldtldcrpewicbqocyttbwloiopauyveapmwvkylxepftamjocccgpnybtubzhnqqnacyihnyhdxcvshiyusemrerqceebxylcqtpgksivchcrsxbcnggypsysnjdwtkbdzptffyvsesvqwpsnzhkphrftuykjogzpyqhqmuhenqulupujukqsyfqrxmtzxomaojjerczyrqmqhyoihhcoixxkyouqxumkefltqsuraaapo

....

2 comments

Dyuman Joshi on 11 Dec 2023

Please give an example of what the data looks like and what the table to be found looks like.

Even better if you can attach a sample file.

Matthew on 11 Dec 2023

Dyuman, I've edited my inquiry to include representative txt file data with table format.


Answer 1

Star Strider on 11 Dec 2023

0 votes

I am not certain how the table actually exists in the file (and it would help significantly to have the actual file to work with rather than an imitation of it). That aside, for FORTRAN output files, using readtable with fixedWidthImportOptions would likely work. You will probably need to experiment with the column widths and line numbers.

6 comments

Star Strider on 12 Dec 2023

file.txt

This seems to work:

% type('file.txt')                                      % Examine File (Optional)
VT = cellstr(repmat("double", 1, 11));                  % 'VariableTypes' Cell Array
opts = fixedWidthImportOptions('NumVariables',11, ...
    'VariableWidths',[8 11 9 10 11 11 11 11 13 13 13], ...
    'DataLines',223, 'VariableTypes',VT, ...
    'VariableNamesLine',220, 'VariableUnitsLine',222);
T1 = readtable('file.txt', opts)

T1 = 668×11 table

     m        rip       lam     ripl     riapl    riahl    rial     ril      riapi       rii      taua
    ____    _______    _____    _____    _____    _____    ____    _____    _______    _______    ____

    2000    0.07744        5    30.98    30.98      0       0      30.98    0.38719    0.38719     1
    2005    0.04646    4.988    18.68    18.68      0       0      18.68    0.61948    0.61948     1
    2010    0.06827    4.975    27.58    27.58      0       0      27.58    0.96082    0.96082     1
    2015     0.1038    4.963    42.13    42.13      0       0      42.13     1.4796     1.4796     1
    2020    0.09099     4.95    37.13    37.13      0       0      37.13     1.9346     1.9346     1
    2025    0.09283    4.938    38.06    38.06      0       0      38.06     2.3987     2.3987     1
    2030    0.05814    4.926    23.96    23.96      0       0      23.96     2.6894     2.6894     1
    2035    0.04384    4.914    18.15    18.15      0       0      18.15     2.9085     2.9085     1
    2040     0.1015    4.902    42.24    42.24      0       0      42.24     3.4161     3.4161     1
    2045     0.0615     4.89    25.72    25.72      0       0      25.72     3.7236     3.7236     1
    2050    0.07926    4.878    33.31    33.31      0       0      33.31     4.1199     4.1199     1
    2055    0.03941    4.866    16.64    16.64      0       0      16.64     4.3169     4.3169     1
    2060    0.05896    4.854    25.02    25.02      0       0      25.02     4.6117     4.6117     1
    2065     0.1107    4.843    47.22    47.22      0       0      47.22     5.1654     5.1654     1
    2070     0.0512    4.831    21.94    21.94      0       0      21.94     5.4214     5.4214     1
    2075    0.05672    4.819    24.42    24.42      0       0      24.42      5.705      5.705     1

VN = T1.Properties.VariableNames;
figure
plot(T1.m, real(T1{:,2:end}), '-')
hold on
% plot(T1.m, imag(T1{:,2:end}), '--')
hold off
grid
xlabel(VN{1})
set(gca, 'YScale','log')
legend(VN{2:end}, 'Location','bestoutside')

For fun (and to get some idea of what is in the file), I plotted it as well.


Voss on 12 Dec 2023

file.txt

T1 has 668 rows, but the table in the file has 268 rows.

VT = cellstr(repmat("double", 1, 11));                  % 'VariableTypes' Cell Array
opts = fixedWidthImportOptions('NumVariables',11,  'VariableWidths',[8 11 9 10 11 11 11 11 13 13 13], 'DataLines',223, 'VariableTypes',{VT{:}}, 'VariableNamesLine',220, 'VariableUnitsLine',222);
T1 = readtable('file.txt', opts);
size(T1)
ans = 1×2
   668    11
T1(264:273,:) % mostly NaNs after row 268
ans = 10×11 table
     m        rip       lam     ripl      riapl      riahl    rial     ril     riapi      rii      taua
    ____    _______    _____    _____    ________    _____    ____    _____    ______    ______    ____

    3315    0.07032    3.017    77.28    77.28+0i       0       0     77.28    71.353    71.353      1 
    3320     0.1078    3.012    118.8    118.8+0i       0       0     118.8    71.892    71.892      1 
    3325     0.1412    3.008    156.1    156.1+0i       0       0     156.1    72.599    72.599      1 
    3330    0.08589    3.003    95.24    95.24+0i       0       0     95.24    73.028    73.028      1 
    3335     0.1272    2.999    141.5    141.5+0i       0       0     141.5    73.664    73.664      1 
     NaN        NaN      NaN      NaN        0+1i     NaN     NaN       NaN       NaN       NaN    NaN 
     NaN        NaN      NaN      NaN      NaN+0i     NaN     NaN       NaN       NaN       NaN    NaN 
     NaN        NaN      NaN      NaN      NaN+0i     NaN     NaN       NaN       NaN       NaN    NaN 
     NaN        NaN      NaN      NaN      NaN+0i     NaN     NaN       NaN       NaN       NaN    NaN 
     NaN        NaN      NaN      NaN      NaN+0i     NaN     NaN       NaN       NaN       NaN    NaN 

Voss on 14 Dec 2023

@Matthew: It's not a problem for you that the table T1 contains 400 extra rows relative to the table in the file? See my answer for a method that produces a table without any extra rows.
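One possible way to drop such trailing rows after a fixed-width import, as a minimal sketch on a made-up numeric matrix (for a table, `T1{:,:}` would play the role of `M`): keep only the rows before the first one that contains a NaN. This assumes the real table has no missing values of its own; the sample numbers are illustrative only.

```matlab
% Hypothetical import result: two valid rows followed by lines that did not parse
M = [2000 0.07744; 2005 0.04646; NaN NaN; NaN NaN];

firstBad = find(any(isnan(M), 2), 1);   % index of the first row with any NaN
if isempty(firstBad)
    firstBad = size(M, 1) + 1;          % no bad rows: keep everything
end
M = M(1:firstBad-1, :);                 % leading block of fully numeric rows
```

Unlike deleting every row that contains a NaN, this also discards anything that happens to parse as numbers further down the file, after the table has ended.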

Star Strider on 14 Dec 2023

@Matthew, as always, my pleasure!

Answer 2

Mathieu NOE on 11 Dec 2023

0 votes

Hello,

you could do this:

out = readcell('Doc1.txt');
eof = size(out,1);
ind1 = find(contains(out,'output'));
% extract valid portion of cell array
out = out(ind1+1:eof-1,:)
header = (split(out(1,:)))'
units = (split(out(3,:)))'
values = str2num(char(out(4:end,:)))
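A line-by-line variant of the same idea, as a minimal sketch: find the first line containing `output`, skip the header block, and collect numeric rows until a line no longer parses as a full row. The sample text, the three-column layout, and the offset of 4 lines from the marker to the first data row are assumptions for illustration, not taken from the actual file.

```matlab
% Stand-in for the file contents (hypothetical sample, 3 columns only).
% For a real file, regexp(fileread('file.txt'), '\r?\n', 'split') would
% give the same kind of cell array of lines.
txt = { ...
    'free-form text before the table'
    '$$$$$$ output $$$$$$'
    '   A        B        C'
    ' ------ --------- ------'
    '   s        m       kg'
    ' 2000.0 7.744E-02  5.000'
    ' 2005.0 4.646E-02  4.988'
    ' 2010.0 6.827E-02  4.975'
    'free-form text after the table'};

nCols      = 3;                                  % assumed number of columns
markerLine = find(contains(txt, 'output'), 1);   % first line mentioning the marker
firstData  = markerLine + 4;                     % assumed: names, dashes, units, then data

values = zeros(0, nCols);
for k = firstData:numel(txt)
    if isempty(strtrim(txt{k}))
        continue                                 % tolerate blank lines between rows
    end
    row = sscanf(txt{k}, '%f').';                % numbers found at the start of the line
    if numel(row) ~= nCols
        break                                    % first non-table line ends the table
    end
    values(end+1, :) = row;                      %#ok<AGROW>
end

header = strsplit(strtrim(txt{markerLine + 1})); % column names as a 1-by-nCols cell
```

Because the loop stops at the first line that does not yield exactly `nCols` numbers, the table can have any number of rows and the text after it is never touched.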

7 comments

Mathieu NOE on 11 Dec 2023

Doc1.txt

This is the result in my command window, based on the attached txt file (copied and pasted from your post):

out =

10×1 cell array

{'A B C D E F G H I J K' }

{'------ --------- ------ ------------------------------------------------------------------------------ ---------------------------------- --------------' }

{'s m kg A K mol'}

{'2000.0 7.744E-02 5.000 3.098E+01 3.098E+01 0.000E+00 0.000E+00 3.098E+01 3.87194E-01 3.87194E-01 1.000E+00' }

{'2005.0 4.646E-02 4.988 1.868E+01 1.868E+01 0.000E+00 0.000E+00 1.868E+01 6.19481E-01 6.19481E-01 1.000E+00' }

{'2010.0 6.827E-02 4.975 2.758E+01 2.758E+01 0.000E+00 0.000E+00 2.758E+01 9.60817E-01 9.60817E-01 1.000E+00' }

{'2015.0 1.038E-01 4.963 4.213E+01 4.213E+01 0.000E+00 0.000E+00 4.213E+01 1.47961E+00 1.47961E+00 1.000E+00' }

{'2020.0 9.099E-02 4.950 3.713E+01 3.713E+01 0.000E+00 0.000E+00 3.713E+01 1.93456E+00 1.93456E+00 1.000E+00' }

{'2025.0 9.283E-02 4.938 3.806E+01 3.806E+01 0.000E+00 0.000E+00 3.806E+01 2.39869E+00 2.39869E+00 1.000E+00' }

{'2030.0 5.814E-02 4.926 2.396E+01 2.396E+01 0.000E+00 0.000E+00 2.396E+01 2.68937E+00 2.68937E+00 1.000E+00' }

header =

1×11 cell array

Columns 1 through 10

{'A'} {'B'} {'C'} {'D'} {'E'} {'F'} {'G'} {'H'} {'I'} {'J'}

Column 11

{'K'}

units =

1×6 cell array

{'s'} {'m'} {'kg'} {'A'} {'K'} {'mol'}

values =

1.0e+03 *

Columns 1 through 9

2.0000 0.0001 0.0050 0.0310 0.0310 0 0 0.0310 0.0004

2.0050 0.0000 0.0050 0.0187 0.0187 0 0 0.0187 0.0006

2.0100 0.0001 0.0050 0.0276 0.0276 0 0 0.0276 0.0010

2.0150 0.0001 0.0050 0.0421 0.0421 0 0 0.0421 0.0015

2.0200 0.0001 0.0050 0.0371 0.0371 0 0 0.0371 0.0019

2.0250 0.0001 0.0049 0.0381 0.0381 0 0 0.0381 0.0024

2.0300 0.0001 0.0049 0.0240 0.0240 0 0 0.0240 0.0027

Columns 10 through 11

0.0004 0.0010

0.0006 0.0010

0.0010 0.0010

0.0015 0.0010

0.0019 0.0010

0.0024 0.0010

0.0027 0.0010

Mathieu NOE on 13 Dec 2023

readclm.m

Hello again,

I have a one-line solution for you (with the help of the attached function, of course):

[outdata,head] = readclm('file.txt',11,219);

outdata 268x11 double

head = 4×119 char array

' m rip lam ripl riapl riahl rial ril riapi rii taua '

' ------ --------- ------ ----------------------------------------------------- ------------------------ ---------'

' m m/J/m A m/J/A m/J t '

' '

Mathieu NOE on 13 Dec 2023

If you want to access the individual labels and units from head, you can do this:

header = (split(strtrim(head(1,:))))'
units = (split(strtrim(head(3,:))))'

header = 1×11 cell array

{'m'} {'rip'} {'lam'} {'ripl'} {'riapl'} {'riahl'} {'rial'} {'ril'} {'riapi'} {'rii'} {'taua'}

units = 1×6 cell array

{'m'} {'m/J/m'} {'A'} {'m/J/A'} {'m/J'} {'t'}


Answer 3

Voss on 12 Dec 2023

0 votes

file.txt

filename = 'file.txt';
fid = fopen(filename,'r');
data = fread(fid,[1 Inf],'*char');
fclose(fid);
idx = strfind(data,'$ output $');
idx = idx(1);
n = 1 + nnz(data(1:idx) == newline());
T = readtable(filename,'NumHeaderLines',n+2);
T(any(isnan(T{:,:}),2),:) = [];
head(T)
     m        rip       lam     ripl     riapl    riahl    rial     ril      riapi       rii      taua
    ____    _______    _____    _____    _____    _____    ____    _____    _______    _______    ____

    2000    0.07744        5    30.98    30.98      0       0      30.98    0.38719    0.38719     1  
    2005    0.04646    4.988    18.68    18.68      0       0      18.68    0.61948    0.61948     1  
    2010    0.06827    4.975    27.58    27.58      0       0      27.58    0.96082    0.96082     1  
    2015     0.1038    4.963    42.13    42.13      0       0      42.13     1.4796     1.4796     1  
    2020    0.09099     4.95    37.13    37.13      0       0      37.13     1.9346     1.9346     1  
    2025    0.09283    4.938    38.06    38.06      0       0      38.06     2.3987     2.3987     1  
    2030    0.05814    4.926    23.96    23.96      0       0      23.96     2.6894     2.6894     1  
    2035    0.04384    4.914    18.15    18.15      0       0      18.15     2.9085     2.9085     1  
tail(T)
     m        rip       lam     ripl     riapl    riahl    rial     ril     riapi      rii      taua
    ____    _______    _____    _____    _____    _____    ____    _____    ______    ______    ____

    3300     0.1136     3.03    123.7    123.7      0       0      123.7    70.035    70.035     1  
    3305    0.07579    3.026    82.79    82.79      0       0      82.79    70.414    70.414     1  
    3310     0.1176    3.021    128.9    128.9      0       0      128.9    71.002    71.002     1  
    3315    0.07032    3.017    77.28    77.28      0       0      77.28    71.353    71.353     1  
    3320     0.1078    3.012    118.8    118.8      0       0      118.8    71.892    71.892     1  
    3325     0.1412    3.008    156.1    156.1      0       0      156.1    72.599    72.599     1  
    3330    0.08589    3.003    95.24    95.24      0       0      95.24    73.028    73.028     1  
    3335     0.1272    2.999    141.5    141.5      0       0      141.5    73.664    73.664     1  
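The line-counting step in the code above can be checked on a tiny in-memory example. This is a minimal sketch with made-up text: the line number of the marker is one plus the number of newline characters that precede it. In the answer, that number plus 2 is then passed to readtable as 'NumHeaderLines'.

```matlab
% Hypothetical file contents as one character vector (marker is on line 3)
data = sprintf('preamble line 1\npreamble line 2\n$$$ output $$$\nA B C\n');

idx = strfind(data, '$ output $');       % positions where the marker text starts
idx = idx(1);                            % keep the first occurrence
n   = 1 + nnz(data(1:idx) == newline);   % newlines before the marker, plus one
```

Here `n` comes out as 3, matching the line on which the marker sits in the sample text.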

1 comment

Matthew on 14 Dec 2023

Voss, thank you, this also works for me!
