Re-write a Data from File (Readtable, Readmatrix, etc) with a replacing rule for a specific data number(s)

Hi everyone,
I want to ask about how to replace the data of 99999.00 in my data to become 0 or [ ] instead while it was proceeded in Re-writing code. So i have this kind of data :
Format IAGA-2002 |
Source of Data |
Station Name |
IAGA Code ABK |
Geodetic Latitude -68.367 |
Geodetic Longitude 18.817 |
Elevation |
Reported XYZF |
Sensor Orientation |
Digital Sampling |
Data Interval Type PT1M |
Data Type DEFINITIVE |
DATE TIME DOY ABKX ABKY ABKZ ABKF |
1979-01-01 00:00:00.000 001 11723.00 545.00 51003.00 99999.00
1979-01-01 00:01:00.000 001 11727.00 546.00 51003.00 99999.00
1979-01-01 00:02:00.000 001 11731.00 545.00 51003.00 99999.00
1979-01-01 00:03:00.000 001 11734.00 545.00 51003.00 99999.00
1979-01-01 00:04:00.000 001 11737.00 546.00 51004.00 99999.00
1979-01-01 00:05:00.000 001 11740.00 546.00 51005.00 99999.00
1979-01-01 00:06:00.000 001 11744.00 546.00 51006.00 99999.00
1979-01-01 00:07:00.000 001 11748.00 544.00 51007.00 99999.00
1979-01-01 00:08:00.000 001 11749.00 545.00 51007.00 99999.00
1979-01-01 00:09:00.000 001 11750.00 541.00 51007.00 99999.00
1979-01-01 00:10:00.000 001 11752.00 539.00 51010.00 99999.00
1979-01-01 00:11:00.000 001 11759.00 542.00 51010.00 99999.00
1979-01-01 00:12:00.000 001 11764.00 539.00 51010.00 99999.00
1979-01-01 00:13:00.000 001 11767.00 539.00 51010.00 99999.00
%I want to replace the data of 99999.00 to become 0 or [ ] instead. its in
%column ABKF
And by using my code (readtable) below (i have already attached the data sample in this question as a reference if you want to run my code below):
close all
clear clc
clear all
clc
[namafile,arah]=uigetfile({'*.txt', 'Text-Files (*.txt)'},'Load Data Magnet Format IAGA WDS (Definitif / Variatif)');
cd(arah);
full = fullfile(arah, namafile);
namafiles = cellstr(sort(namafile)); %
fulls = string(full);
nfiles1 = size(full,2);
nfiles2 = numel(full);
f = cellstr(fullfile(arah,namafile));
file = length(f);
[~, base, ~] = fileparts(full);
judul = length(base);
defm = sum(contains(base, 'dmin'));
stasdmin = upper(base(1:3));
stddmin = upper(extractBetween(base,1,3))+"D";
stddminfix = convertStringsToChars(stddmin);
sthdmin = upper(extractBetween(base,1,3))+"H";
stzdmin = upper(extractBetween(base,1,3))+"Z";
stfdmin = upper(extractBetween(base,1,3))+"F";
stxdmin = upper(extractBetween(base,1,3))+"X";
stydmin = upper(extractBetween(base,1,3))+"Y";
tipedmin = lower(base(end-3:end));
opts = detectImportOptions(full,"NumHeaderLines",12, 'EmptyFieldRule', 'missing', 'Whitespace', '', 'FileType','text');
opts.SelectedVariableNames = ["DATE","TIME",stzdmin,stxdmin,stydmin,stfdmin];
Tdmin = readtable(full,opts);
DATEdmin = Tdmin.DATE;
TIMEdmin = Tdmin.TIME;
DOXdmin = Tdmin.(stxdmin);
DOYdmin = Tdmin.(stydmin);
DOZdmin = Tdmin.(stzdmin);
DOFdmin = Tdmin.(stfdmin);
DOHSdmin = sqrt((DOXdmin.^2)+(DOYdmin.^2));
DOFSdmin = sqrt((DOXdmin.^2)+(DOYdmin.^2)+ (DOZdmin.^2));
awaldmin = string(DATEdmin(1));
akhirdmin = string(DATEdmin(end));
alpha_dmin = datetime(awaldmin,'InputFormat','yyyy-MM-dd');
omega_dmin = datetime(akhirdmin,'InputFormat','yyyy-MM-dd');
Dates__dmin = string(DATEdmin);
Times__dmin = string(TIMEdmin);
periodedmin0 = DATEdmin + TIMEdmin;
eval(['cd ''' arah ''';']);
iagadmin = sprintf('Komponen Magnet Data Definitif IAGA Per-Menit (%s) Periode %s Hingga %s Stasiun %s.txt', tipedmin, alpha_dmin, omega_dmin, stasdmin);
fulliagadmin = fullfile(arah, iagadmin);
fout = fopen(fulliagadmin,'w');
fprintf(fout,'%s %s %s %s %s %s %s %s\n', 'DATE', 'TIME', 'H(Calc)', 'F(Calc)', 'F(Obs)', 'X(Obs)', 'Y(Obs)', 'Z(Obs)');
fprintf(fout,'%s %s %.2f %.2f %.2f %.2f %.2f %.2f\n', [Dates__dmin, Times__dmin, DOHSdmin, DOFSdmin, DOFdmin, DOXdmin, DOYdmin, DOZdmin]');
fclose(fout);
The data would be re-writen and become like this :
DATE TIME H(Calc) F(Calc) F(Obs) X(Obs) Y(Obs) Z(Obs)
1979-01-01 00:00:00 11735.66 52335.76 99999.00 11723.00 545.00 51003.00
1979-01-01 00:01:00 11739.70 52336.67 99999.00 11727.00 546.00 51003.00
1979-01-01 00:02:00 11743.65 52337.55 99999.00 11731.00 545.00 51003.00
1979-01-01 00:03:00 11746.65 52338.22 99999.00 11734.00 545.00 51003.00
1979-01-01 00:04:00 11749.69 52339.88 99999.00 11737.00 546.00 51004.00
1979-01-01 00:05:00 11752.69 52341.53 99999.00 11740.00 546.00 51005.00
1979-01-01 00:06:00 11756.69 52343.40 99999.00 11744.00 546.00 51006.00
1979-01-01 00:07:00 11760.59 52345.25 99999.00 11748.00 544.00 51007.00
1979-01-01 00:08:00 11761.63 52345.49 99999.00 11749.00 545.00 51007.00
1979-01-01 00:09:00 11762.45 52345.67 99999.00 11750.00 541.00 51007.00
......................
I just want the data number of 99999.00 in column F(obs) be replaced by 0 or [ ] instead as the output.
So would anyone can help me out in finding a solution for that? Iam so grateful if someone can help me because cant think anymore how to do that. Thank you very much everyone /.\ /.\ /.\

 Réponse acceptée

Cris LaPierre
Cris LaPierre le 17 Sep 2021
Modifié(e) : Cris LaPierre le 17 Sep 2021
If you are already using readtable, look into the TreatAsMissing name-value pair input argument.
You might also consider the standardizeMissing function. You would have to read the data into MATLAB first, but you could then use this function to convert 99999.00 to the standard <missing> value for that data type.
Standard missing values depend on the data type:
  • NaN for double, single, and duration
  • NaT for datetime
  • <missing> for string
  • <undefined> for categorical
  • ' ' for char
  • {''} for cell of character arrays
Once your missing values have been standardized, you can then use the fillmissing function to replace missing values with one of your choosing.

11 commentaires

Oh, alright, Sir.
So in this case, by using the logic of my code above i can use this :
opts = detectImportOptions(full,"NumHeaderLines",12, 'EmptyFieldRule', 'missing', 'Whitespace', '', 'FileType','text');
opts.SelectedVariableNames = ["DATE","TIME",stzdmin,stxdmin,stydmin,stfdmin];
Tdmin = readtable(full,opts);
%Plus the code of standardizeMissing:
standardizeMissing(Tdmin,{99999.00, 'N/A'},'DataVariables',[(stfdmin)])
Is this right, sir?
What happens when you run it?
You should also be able to add the TreatAsMissing to your detectImportOptions.
opts = detectImportOptions(full,"NumHeaderLines",12, 'EmptyFieldRule', 'missing',...
'Whitespace', '', 'FileType','text'...
'TreatAsMissing',{99999.00, 'N/A'});
Oh yeah, its work, Sir
so im just applying this code :
opts = detectImportOptions(full,"NumHeaderLines",12, 'EmptyFieldRule', 'missing', 'Whitespace', '', 'FileType','text');
opts.SelectedVariableNames = ["DATE","TIME",stzdmin,stxdmin,stydmin,stfdmin];
Tdmins = readtable(full,opts);
%Plus the code of standardizeMissing:
Tdmin = standardizeMissing(Tdmins,{99999.00, 'N/A'});
DOXdmin = Tdmin.(stxdmin);
DOYdmin = Tdmin.(stydmin);
DOZdmin = Tdmin.(stzdmin);
And the result is
DATE TIME H(Calc) F(Calc) F(Obs) X(Calc) Y(Calc) Z(Obs)
1978-03-01 00:00:00 NaN NaN NaN NaN NaN NaN
1978-03-01 00:01:00 NaN NaN NaN NaN NaN NaN
1978-03-01 00:02:00 NaN NaN NaN NaN NaN NaN
1978-03-01 00:03:00 NaN NaN NaN NaN NaN NaN
1978-03-01 00:04:00 NaN NaN NaN NaN NaN NaN
1978-03-01 00:05:00 NaN NaN NaN NaN NaN NaN
1978-03-01 00:06:00 NaN NaN NaN NaN NaN NaN
1978-03-01 00:07:00 NaN NaN NaN NaN NaN NaN
Thats what i want. Thank you so much, Sir
Oh, Hi Sir Chris La Pierre,
I wanna ask again, iam sorry, but how to do that in readmatrix instead of readtable?
as a readtable case... I cant use standardizeMissing function because the standardizeMissing(A,indicator) is using A as table, Sir...
Thanks Again, Sir..... /.\ /.\ /.\
readmatrix has a TreatAsMissing input option (scroll down until you see the yellow highlight).
Tdmins = readmatrix(full,'TreatAsMissing',{99999.00, 'N/A'});
standardizeMissing is a helpful utility function for tables because each variable (column) of a table can be a different datatype. With readmatrix, all your data must be of the same data type. That means that you don't have to worry about different data types, so can just use logical indexing. Helpful functions for this are ismember and ismembertol
ind = ismember(Tdmins,99999);
Tdmins(ind) = 0;
That was great, Sir.
Iam trying with this code :
B = readmatrix(full, opts);
B = readmatrix(full, 'TreatAsMissing',{'99999.00', 'N/A'});
And its work. But before, when iam trying with
B = readmatrix(full, opts, 'TreatAsMissing',{'99999.00', 'N/A'});
Its just show as an error because of the wrong dimension or pair or something like that....
Thank you very much, Sir /.\ /.\
Can you share the full error message (all the red text)?
Alright, SIr
My code is :
opts = delimitedTextImportOptions('Delimiter',' ',...
'VariableNames', {'VarName1','VarName2','VarName3','VarName4','VarName5','VarName6','VarName7','VarName8','VarName9','VarName10','VarName11','VarName12','VarName13'},...
'WhiteSpace', '');
opts.MissingRule = 'fill';
B = readmatrix(full, opts, 'TreatAsMissing',{'99999.00', 'N/A'});
And here is the error for that code
Error using matlab.io.ImportOptions/readmatrix (line 677)
Unknown Parameter 'TreatAsMissing'.
Error in lemi2comp (line 37)
B = readmatrix(full, opts, 'TreatAsMissing',{'99999.00', 'N/A'});
But when i re-arrange the code like this :
opts = delimitedTextImportOptions('Delimiter',' ',...
'VariableNames', {'VarName1','VarName2','VarName3','VarName4','VarName5','VarName6','VarName7','VarName8','VarName9','VarName10','VarName11','VarName12','VarName13'},...
'WhiteSpace', '');
opts.MissingRule = 'fill';
B = readmatrix(full, opts);
B = readmatrix(full, 'TreatAsMissing',{'99999.00', 'N/A'});
The code will be executed correctly
I think the issue here is that delimitedTextImportOptions does not support 'TreatAsMissing'. I wonder if you could use detectImportOptions instead?
If iam using readmatrix, then the option cant be a detectImportoption, Sir, but it should be a delimitedTextImportOptions... But iam not sure... Would you explain how to do that?
You can use detectImportOptions with readmatrix. See this example on the readmatrix documentation page.
opts = detectImportOptions('abk197901dmin.txt',"NumHeaderLines",12, 'EmptyFieldRule', 'missing',...
'Whitespace', '', 'FileType','text',...
'TreatAsMissing',{'99999.00', 'N/A'});
data = readmatrix('abk197901dmin.txt')
data = 44640×8
NaN NaN 1 11723 545 51003 99999 NaN NaN NaN 1 11727 546 51003 99999 NaN NaN NaN 1 11731 545 51003 99999 NaN NaN NaN 1 11734 545 51003 99999 NaN NaN NaN 1 11737 546 51004 99999 NaN NaN NaN 1 11740 546 51005 99999 NaN NaN NaN 1 11744 546 51006 99999 NaN NaN NaN 1 11748 544 51007 99999 NaN NaN NaN 1 11749 545 51007 99999 NaN NaN NaN 1 11750 541 51007 99999 NaN
You'll get NaN in the first 2 columns because your date and time values couldn't be convereted to doubles.

Connectez-vous pour commenter.

Plus de réponses (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by