How to parse values from .txt/XML file?

15 vues (au cours des 30 derniers jours)
MathandPhysics
MathandPhysics le 20 Nov 2022
Commenté : Voss le 7 Mar 2023
I'm trying to read and store numbers which are sandwiched between some text, but repeat. The file looks something like this:
<Info>
<Date>2022-11-01T05:36</date>
<Time>05:36</Time>
<Cost>101.30</Cost>
</Info>
<Info>
<Date>2022-11-01T13:22</date>
<Cost>107.50</Cost>
</Info>
<Info>
<Date>2022-11-01T17:05</date>
<Cost>203.73</Cost>
</Info>
And so on. This will repeat
What I'm trying to do is to parse out the date and cost for each time so that I can eventually put it into an excel spreadsheet.
The issue I'm having is I can't get it to read (and I'm not sure how to store the variables.
I started by just trying ot extract the cost. I tried this:
fid = fopen('CostFunc.txt');
tline = fgetl(fid);
lineCounter = 1;
while iscar(tline)
if contains (tline, '<Cost>', 'IgnoreCase', true)
disp(tline)
end
tline = fget(fid);
lineCounter = lineCounter +1
end
fclose(fid);
And it'll show me each cost, but it doesn't store anything and I can't do anything with them (for example find average cost, nor can I write it to excel).
I have no clue how to handle the date/time.
Any assistance is appreciated!
  2 commentaires
Walter Roberson
Walter Roberson le 20 Nov 2022
try the new readstruct()
MathandPhysics
MathandPhysics le 20 Nov 2022
I get an error stating "Unrecognized function or variable readstruct'.
I think because I am using 2019b, I can't use this (and I unfortunately am not able to change the version of Matlab!)

Connectez-vous pour commenter.

Réponses (1)

Walter Roberson
Walter Roberson le 20 Nov 2022
S = fileread('CostFunc.txt');
parts = regexp(S, '(?<=Date>)(?<Date>[^<]+).*?(?<=Cost>)(?<Cost>[^<]+)', 'names');
Dates = datetime({parts.Date}, 'InputFormat', "uuuu-MM-dd'T'HH:mm");
Costs = str2double({parts.Cost});
  3 commentaires
MathandPhysics
MathandPhysics le 7 Mar 2023
I also tried it with HH:mm:ss to see if that made a difference, no change, same error.
Voss
Voss le 7 Mar 2023
@MathandPhysics: Seems to work using a text file made by copying and pasting the text from the question.
S = fileread('CostFunc.txt');
parts = regexp(S, '(?<=Date>)(?<Date>[^<]+).*?(?<=Cost>)(?<Cost>[^<]+)', 'names');
Dates = datetime({parts.Date}, 'InputFormat', "uuuu-MM-dd'T'HH:mm");
Costs = str2double({parts.Cost});
Dates,Costs
Dates = 1×3 datetime array
01-Nov-2022 05:36:00 01-Nov-2022 13:22:00 01-Nov-2022 17:05:00
Costs = 1×3
101.3000 107.5000 203.7300
Maybe you can upload the actual file you have, using the paperclip button.

Connectez-vous pour commenter.

Catégories

En savoir plus sur Data Type Identification dans Help Center et File Exchange

Produits


Version

R2019b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by