Extracting specific repeating lines of text after a heading using fgetl and textscan

Question

Vincent Scalfani on 19 Jul 2016

https://fr.mathworks.com/matlabcentral/answers/296250-extracting-specific-repeating-lines-of-text-after-a-heading-using-fgetl-and-textscan

Commented: Vincent Scalfani on 21 Jul 2016

Here is an example of the data I am working with. I would like to extract the line directly following each KEY tag. The files have many thousands of these, so I need to create a loop with textscan or something similar.

> <NAME>
mary
> <AGE>
30
> <KEY>
RDHQFKQIGNG
> <NAME>
john
> <AGE>
56
> <KEY>
JFJNNFNFKFNN

Desired result:

RDHQFKQIGNG
JFJNNFNFKFNN

Here is where I am at (adapted from a similar question in the past). The code does not seem to be moving the cursor: it works for the first KEY tag, but then grabs all of the data after it instead of just the line following each KEY tag.

f = fopen('data.txt', 'rt'); 
tline = fgetl(f);
while isempty(strfind(tline, '> <KEY>'))
    if tline == -1 
        break;
    end
    tline = fgetl(f);
end
if tline ~= -1
    data = textscan(f,'%s','Delimiter','\r\n');
else
    disp('not found');
end
fclose(f);

Thanks!


Answer 1 (Accepted Answer)

Stephen23 on 19 Jul 2016

https://fr.mathworks.com/matlabcentral/answers/296250-extracting-specific-repeating-lines-of-text-after-a-heading-using-fgetl-and-textscan#answer_229053

>> str = fileread('temp1.txt');
>> C = regexp(str,'(?<=> <KEY>\s+)\S+','match')
C = 
  'RDHQFKQIGNG'    'JFJNNFNFKFNN'

Tested on this file: temp1.txt
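For readers who would rather stay with the fgetl approach from the question, a line-by-line version might look like the sketch below. The loop in the question stops at the first KEY tag and then hands the rest of the file to a single textscan call, which is why everything after the first tag comes back. Here the loop runs to the end of the file and reads exactly one extra line after each tag. The sample file, the temporary file name, and the variable names keys and nextLine are assumptions made for illustration only; this is not part of the original answer and has not been timed against the regexp version.

```matlab
% Write a small sample file so the sketch runs on its own (demo only).
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '> <NAME>\nmary\n> <AGE>\n30\n> <KEY>\nRDHQFKQIGNG\n');
fprintf(fid, '> <NAME>\njohn\n> <AGE>\n56\n> <KEY>\nJFJNNFNFKFNN\n');
fclose(fid);

% Read line by line and keep the line that follows each '> <KEY>' tag.
fid = fopen(fname, 'rt');
keys = {};
tline = fgetl(fid);
while ischar(tline)                        % fgetl returns numeric -1 at end of file
    if ~isempty(strfind(tline, '> <KEY>'))
        nextLine = fgetl(fid);             % the value sits on the next line
        if ischar(nextLine)
            keys{end+1} = strtrim(nextLine); %#ok<AGROW>
        end
    end
    tline = fgetl(fid);                    % advance on every iteration
end
fclose(fid);
delete(fname);                             % remove the demo file
disp(keys)
```

On a real file, the first block would be dropped and fname set to the actual file name, for example 'data.txt' as in the question.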

3 comments (1 older comment not shown)

Stephen23 on 20 Jul 2016

Try this:

  E = regexp(str,'^> <KEY>\s+\S+','match','lineanchors');
  E = strtrim(strrep(E,'> <KEY>',''));

And have a play with this script: temp1.m
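To see what the two steps do without needing a file, the same calls can be run on an inline string. This is only a small sketch: the sample text is copied from the question and built with sprintf, which is an assumption made for the demo (the original reads str with fileread). The first call matches each tag line together with the token that follows it; the second removes the tag and trims the leftover newline.

```matlab
% Inline sample mirroring the data in the question (demo only).
str = sprintf(['> <NAME>\nmary\n> <AGE>\n30\n> <KEY>\nRDHQFKQIGNG\n' ...
               '> <NAME>\njohn\n> <AGE>\n56\n> <KEY>\nJFJNNFNFKFNN\n']);

% Step 1: match '> <KEY>' at the start of a line plus the next non-space token.
E = regexp(str, '^> <KEY>\s+\S+', 'match', 'lineanchors');

% Step 2: drop the tag text, then trim the remaining whitespace.
E = strtrim(strrep(E, '> <KEY>', ''));

disp(E)
```

With the sample above, E is a 1-by-2 cell array holding the two KEY values.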

Vincent Scalfani on 21 Jul 2016

Amazing!!! PERFECT. It took 1 second to process over 4 million lines of text. Thanks so much for your time.
