Deleting part of a column between two dates

Micky Josipovic on 9 Mar 2020
Commented: Micky Josipovic on 11 Mar 2020
daten1=floor(gas_calcorr(1,1));
% daten1=datenum(2018,08,20);
% daten2=floor(gas_calcorr(end,1));
daten2=datenum(2018,08,31);
RemoveData=(gas_calcorr(daten1:daten2,7));
  3 comments
Micky Josipovic on 9 Mar 2020
Yes, you are right, that was the wrong way. To be more specific: I just want to delete very noisy data from a period when an instrument was malfunctioning, and unfortunately it is a long vector. The period runs from 01/02/2018 until 31/08/2018. My matrix is the following:
% Columns:
% 1: time
% 2: pressure drop in inlet (provides information on possible jams in inlet)
% 3: O3
% 4: SO2
% 5: NO
% 6: NOx
% 7: CO
All the others are fine except CO, which must go out (become NaN) within these dates. My data is in 1-minute resolution and there is too much of it to manipulate in the Variable Editor. Could you write me an example script? Thx. MJ
Benjamin Großmann on 9 Mar 2020
Okay, I think I got the problem and will prepare an example script. Is column 2 a criterion for the malfunction, so that if column 2 is true then CO should be NaN?


Accepted Answer

Benjamin Großmann on 9 Mar 2020
clearvars
close all
clc
% let's create the date column (I only use 1 hour with an increment of 1 minute), but this should work for any length
dt_datetime = datetime(2018,02,01,14,00,00):minutes(1):datetime(2018, 02, 01, 14, 59, 59);
dt = datenum(dt_datetime); % this should look like your first column transposed
% Generate the rest of your data as random values and attach it to the time
% vector
data_orig = [dt' rand(size(dt,2), 6)];
% now, the variable "data_orig" should have the dimensions of your gas_calcorr variable
% We now can try to manipulate the data
%% Example 1) Give specific start and end date and set the CO values (seventh column) within these dates to NaN
data1 = data_orig; % do not override the original data since we need it for another example
start_date = datenum(2018, 02, 01, 14, 20, 00);
end_date = datenum(2018, 02, 01, 14, 25, 00);
% generate a mask where the date fulfills the criterion
mask1 = (data1(:, 1) >= start_date) & (data1(:, 1) <= end_date); % creates a logical vector with 1s and 0s
% use logical indexing as row index to apply the mask:
% Set the values in the 7th column and each row where the mask is 1 to NaN
data1(mask1, 7) = NaN;
%% Example 2) Search for a criterion in column 2 and apply the mask
% do not override the original data since we may need it for another example
data2 = data_orig;
mask2 = data2(:,2) >= 0.5;
% use logical indexing as row index to apply the mask for the corresponding mask
data2(mask2, 7) = NaN;
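Applied to the matrix from the question, Example 1 might look like the sketch below. It assumes that gas_calcorr stores MATLAB datenum values in column 1 and CO in column 7, as described in the comments above. The small stand-in matrix is made up (one row per day instead of one per minute) so that the snippet runs on its own. The end of the interval is written as "before 1 Sep 2018" so that every minute of 31 Aug 2018 is included.

```matlab
% Stand-in for the real gas_calcorr (hypothetical data, one row per day
% to keep the example small; the real matrix has 1-minute resolution)
t = (datenum(2018, 1, 25):datenum(2018, 9, 5))';
gas_calcorr = [t rand(numel(t), 6)];

% Malfunction period of the CO analyser: 01/02/2018 up to and including 31/08/2018
start_date = datenum(2018, 2, 1);
end_date   = datenum(2018, 9, 1);   % exclusive upper bound

% Logical mask over the rows: compare the time stamps, do not use them as row indices
mask = (gas_calcorr(:, 1) >= start_date) & (gas_calcorr(:, 1) < end_date);

% Only the CO column (7) is set to NaN, all other columns stay untouched
gas_calcorr(mask, 7) = NaN;
```

With the real data, only the three lines that define the dates, build the mask and assign NaN are needed.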
  9 comments
Benjamin Großmann on 10 Mar 2020
Hey Micky,
your code seems fine. It could be improved at one point or another, but it gets the job done. I think that you only looked at data points where the CO value is NaN. Remember, as I said in the earlier comment, the data that you uploaded to Google Drive already contains close to 200,000 NaNs. If you don't see any CO data in the plot, then the whole day contains NaNs.
Please set your daten1 variable to something like
daten1=datenum(2018, 11, 15);
to get some data for the CO plot.
If you set it to
daten1=datenum(2018, 12, 04);
you can see a gap in the data in all subplots.
Please let me know if you need further help. Do you know where these NaNs in your original data come from? We can also try to investigate the NaNs in your original data, maybe graphically.
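One way to get an overview of where the NaNs sit is to count them per day. The following is only a sketch on made-up stand-in data; it assumes datenum time stamps in column 1 and CO in column 7, and the variable names are chosen for illustration.

```matlab
% Stand-in data (hypothetical): 3 days at 1-minute resolution
t = datenum(2018, 12, 3) + (0:3*1440-1)' / 1440;
gas_calcorr = [t rand(numel(t), 6)];
gas_calcorr(1441:2880, 7) = NaN;   % pretend the whole second day is missing
gas_calcorr(10:19, 7) = NaN;       % and a short gap on the first day

% Flag the NaNs in the CO column and reduce each time stamp to its day
co_is_nan = isnan(gas_calcorr(:, 7));
day_num = floor(gas_calcorr(:, 1));

% Group the rows by day and sum the NaN flags within each group
[unique_days, ~, idx] = unique(day_num);
nan_per_day = accumarray(idx, double(co_is_nan));

% Print one summary line per day
for k = 1:numel(unique_days)
    fprintf('%s: %d of %d CO values are NaN\n', ...
        datestr(unique_days(k), 'yyyy-mm-dd'), nan_per_day(k), nnz(idx == k));
end
```

For a graphical view, bar(unique_days, nan_per_day) followed by datetick('x') should show which days are affected most.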
Micky Josipovic on 11 Mar 2020
Hi Benni,
Yes indeed, there must have been more days with NaNs after 01/09/18 (I checked a few, not all). Thanks for your assistance, highly appreciated.
The script works now as you indicated! And to answer your question about the NaNs in the raw (and semi-cleaned data matrix):
Those are generally power-cut periods and interferences with our measurements due to maintenance, checks, calibrations, etc. Indeed there are many, but the main culprit is our electricity grid, which gives the whole of South Africa many hours of cuts and dips... So in order for us to clean the data, all those interferences must be flagged and cut out at the beginning, before further work in which we look at other unrealistic and improbable outliers and cut them out at our discretion. Despite this we have high data retention, and the case of our CO analyser was an odd one: it malfunctioned from February till August (we could not get another one to replace it)...
Thank you for offering your further assistance. I am fine for now but will count on the "MATLAB Answers" community in the future, of course.
Kind regards,
MJ


More Answers (0)
