How to parse values from .txt/XML file?

I'm trying to read and store numbers that are sandwiched between text in a repeating pattern. The file looks something like this:
<Info>
<Date>2022-11-01T05:36</Date>
<Time>05:36</Time>
<Cost>101.30</Cost>
</Info>
<Info>
<Date>2022-11-01T13:22</Date>
<Cost>107.50</Cost>
</Info>
<Info>
<Date>2022-11-01T17:05</Date>
<Cost>203.73</Cost>
</Info>
And so on; the <Info> block repeats.
What I'm trying to do is parse out the date and cost from each block so that I can eventually put them into an Excel spreadsheet.
The issue I'm having is that I can't get it to read the values, and I'm not sure how to store them in variables.
I started by just trying to extract the cost. I tried this:
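fid = fopen('CostFunc.txt');
tline = fgetl(fid);                 % read the first line
lineCounter = 1;
while ischar(tline)                 % fgetl returns -1 (not char) at end of file
    if contains(tline, '<Cost>', 'IgnoreCase', true)
        disp(tline)                 % only displays the line; nothing is stored
    end
    tline = fgetl(fid);             % read the next line
    lineCounter = lineCounter + 1;
end
fclose(fid);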
And it'll show me each cost, but it doesn't store anything, so I can't do anything with the values (for example, find the average cost or write them to Excel).
I have no clue how to handle the date/time.
Any assistance is appreciated!

2 Comments

Walter Roberson on 20 Nov 2022
Try the new readstruct().
MathandPhysics on 20 Nov 2022
I get an error stating "Unrecognized function or variable 'readstruct'."
I think this is because I am using R2019b, so I can't use it (and unfortunately I am not able to change my version of MATLAB!).
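For releases where readstruct is unavailable, one option is to keep the line-by-line loop from the question and collect each value into an array instead of only displaying it. The following is a minimal sketch, not a definitive solution: it writes a small sample file to a temporary location (an assumption made so the example is self-contained; in practice the loop would open 'CostFunc.txt' directly) and assumes each cost sits on a single line as <Cost>value</Cost>.

```matlab
% Create a small stand-in for CostFunc.txt so the sketch runs on its own.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', ...
    '<Info>', '<Date>2022-11-01T05:36</Date>', '<Cost>101.30</Cost>', '</Info>', ...
    '<Info>', '<Date>2022-11-01T13:22</Date>', '<Cost>107.50</Cost>', '</Info>', ...
    '<Info>', '<Date>2022-11-01T17:05</Date>', '<Cost>203.73</Cost>', '</Info>');
fclose(fid);

costs = [];                          % grows by one element per <Cost> line
fid = fopen(fname);
tline = fgetl(fid);
while ischar(tline)                  % fgetl returns -1 at end of file
    % Capture the text between <Cost> and </Cost>, if this line has it.
    tok = regexp(tline, '<Cost>([^<]+)</Cost>', 'tokens', 'once', 'ignorecase');
    if ~isempty(tok)
        costs(end+1) = str2double(tok{1}); %#ok<SAGROW>
    end
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);                       % remove the temporary sample file

avgCost = mean(costs);               % the values are now ordinary numbers
```

Because costs is a plain numeric vector, functions such as mean, max, or sum can be applied to it directly.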


Answers (1)

Walter Roberson on 20 Nov 2022
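S = fileread('CostFunc.txt');   % read the whole file as one character vector
% Named tokens: the text after each "Date>" and the next text after "Cost>"
parts = regexp(S, '(?<=Date>)(?<Date>[^<]+).*?(?<=Cost>)(?<Cost>[^<]+)', 'names');
Dates = datetime({parts.Date}, 'InputFormat', "uuuu-MM-dd'T'HH:mm");   % 1-by-N datetime
Costs = str2double({parts.Cost});                                      % 1-by-N double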
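Since the question also asks about averaging the costs and getting the data into Excel, here is a rough sketch of one way to continue from Dates and Costs. It is an illustration under assumptions rather than part of the original answer: the sample text stands in for fileread('CostFunc.txt'), and the output file name CostSummary.xlsx and its location in tempdir are hypothetical.

```matlab
% Stand-in for S = fileread('CostFunc.txt'), modeled on the sample in the question.
S = sprintf('%s\n', ...
    '<Info>', '<Date>2022-11-01T05:36</Date>', '<Time>05:36</Time>', ...
    '<Cost>101.30</Cost>', '</Info>', ...
    '<Info>', '<Date>2022-11-01T13:22</Date>', '<Cost>107.50</Cost>', '</Info>', ...
    '<Info>', '<Date>2022-11-01T17:05</Date>', '<Cost>203.73</Cost>', '</Info>');

parts = regexp(S, '(?<=Date>)(?<Date>[^<]+).*?(?<=Cost>)(?<Cost>[^<]+)', 'names');
Dates = datetime({parts.Date}, 'InputFormat', 'uuuu-MM-dd''T''HH:mm');
Costs = str2double({parts.Cost});

% One row per <Info> block; (:) turns the row vectors into columns.
T = table(Dates(:), Costs(:), 'VariableNames', {'Date', 'Cost'});
avgCost = mean(T.Cost);

% Hypothetical output location; change it to wherever the spreadsheet should go.
outFile = fullfile(tempdir, 'CostSummary.xlsx');
writetable(T, outFile);
```

Keeping the dates and costs together in a table means each row stays paired, and writetable picks the spreadsheet format from the .xlsx extension.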

3 Comments

MathandPhysics on 7 Mar 2023
OK, I was finally able to sit down with this again and ended up with the following error:
"Error using datetime. Unable to convert the text to datetime using the format 'uuuu-MM-dd'T'HH:mm'."
I tried changing 'uuuu' to 'yyyy' and 'yy', and ended up with the exact same error.
MathandPhysics on 7 Mar 2023
I also tried it with HH:mm:ss to see if that made a difference; no change, same error.
@MathandPhysics: Seems to work using a text file made by copying and pasting the text from the question.
S = fileread('CostFunc.txt');
parts = regexp(S, '(?<=Date>)(?<Date>[^<]+).*?(?<=Cost>)(?<Cost>[^<]+)', 'names');
Dates = datetime({parts.Date}, 'InputFormat', "uuuu-MM-dd'T'HH:mm");
Costs = str2double({parts.Cost});
Dates,Costs
Dates = 1×3 datetime array
01-Nov-2022 05:36:00 01-Nov-2022 13:22:00 01-Nov-2022 17:05:00
Costs = 1×3
101.3000 107.5000 203.7300
Maybe you can upload the actual file you have, using the paperclip button.
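In the meantime, one way to narrow this down might be to test each extracted date string on its own and print the character codes of any that fail, since stray characters or a different layout in the real file would not be visible in the posted sample. This is only a diagnostic sketch: the rawDates entries below are made-up variants, and in practice they would be replaced with {parts.Date}.

```matlab
% Hypothetical inputs; replace with rawDates = {parts.Date}; for the real file.
rawDates = {'2022-11-01T05:36', ...
            '2022-11-01T13:22:45', ...
            'not-a-date'};
fmt = 'uuuu-MM-dd''T''HH:mm';        % same format as in the answer

ok = false(size(rawDates));          % ok(k) is true when entry k parses
for k = 1:numel(rawDates)
    try
        datetime(rawDates{k}, 'InputFormat', fmt);
        ok(k) = true;
    catch
        % Character codes reveal hidden characters such as 13 (carriage
        % return) or 32 (space) that do not show up when displayed.
        fprintf('Entry %d (length %d) failed, codes: %s\n', ...
            k, numel(rawDates{k}), mat2str(double(rawDates{k})));
    end
end
```

If the failing entries turn out to carry leading or trailing whitespace, applying strtrim to them before calling datetime would be one thing to try; if they have extra fields, the InputFormat would need to match the actual layout.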

Release: R2019b
