# How can I determine where there are date gaps in an array of date strings? And fill in the gaps...

Red Beard on 24 Jan 2013
Answered: Peter Perkins on 3 Jan 2017
Dear All,
I'm a bit of a Matlab novice and I'm really struggling with the following problem. I really appreciate any help you can give.
I have an array consisting of dates and the corresponding data entries for each date (note that the date entries are strings). The rows are sorted in ascending date order. It looks something like this:
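| Row | Date/Time (col. 1) | Value (col. 2) | Value (col. 3) | Value (col. 4) |
|-----|--------------------|----------------|----------------|----------------|
| 1.  | 01.01.2012 03:00   | -0.0046        | -0.0056        | 0.0024         |
| 2.  | 02.01.2012 03:00   | -0.0047        | -0.0051        | 0.0023         |
| 3.  | 03.01.2012 03:00   | -0.0042        | -0.0053        | 0.0021         |
| 4.  | 05.01.2012 03:00   | -0.0038        | -0.0049        | 0.0018         |
| 5.  | 06.01.2012 03:00   | -0.0036        | -0.0045        | 0.0017         |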
etc...
However, the dates (and corresponding data) have gaps (e.g. lines 3-4 above). This is causing me a problem when I come to create a plot; the dates and corresponding data are plotted linearly but I need to be able to distinguish on the plot where the date gaps are. [See example plot with problem].
My idea is to fill these gaps with blank data. So in the above example table, I need to insert a row between 3. and 4. for the 04.01.2012 date with data values of 0 (zero). That way they are easy to distinguish on the plot.
Unfortunately, I just don't know how to do this. Any ideas?

Arthur on 24 Jan 2013
I think you can do this without a loop. Since you want an 'ideal' date array, you can simply build it and then look up where your real data fits into it. Something like this:
datenumbers = cellfun(@(x) datenum(x,'dd.mm.yyyy HH:MM'), dates);
requiredDates = datenumbers(1):datenumbers(end); %monotonic increasing array of all dates you want
idx = arrayfun(@(x) find(x == requiredDates),datenumbers); %finds where your real dates are in requiredDates
%create output arrays:
n = length(requiredDates);
dataOut = nan(1,n);
%seed input data in dataOut, at the correct location:
dataOut(idx) = dataIn;
Missing data will be NaN.
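A variation on the same idea, as a minimal sketch rather than a definitive implementation: it uses ismember instead of arrayfun/find and keeps all value columns. The sample data is copied from the question; the variable names dayNumbers, requiredDays and rowIdx are introduced here for illustration, and the code assumes at most one row per calendar day.

```matlab
% Sample input copied from the question (04.01.2012 is missing).
dates  = {'01.01.2012 03:00'; '02.01.2012 03:00'; '03.01.2012 03:00'; ...
          '05.01.2012 03:00'; '06.01.2012 03:00'};
dataIn = [-0.0046 -0.0056 0.0024; ...
          -0.0047 -0.0051 0.0023; ...
          -0.0042 -0.0053 0.0021; ...
          -0.0038 -0.0049 0.0018; ...
          -0.0036 -0.0045 0.0017];

% Convert to serial date numbers and keep only the whole-day part,
% so the comparison below works on exact integers.
dayNumbers = floor(datenum(dates, 'dd.mm.yyyy HH:MM'));

% Every day from the first to the last observation.
requiredDays = (dayNumbers(1):dayNumbers(end))';

% Row of requiredDays that each real observation belongs to.
[~, rowIdx] = ismember(dayNumbers, requiredDays);

% Pre-fill with NaN, then copy the real rows into place.
dataOut = nan(numel(requiredDays), size(dataIn, 2));
dataOut(rowIdx, :) = dataIn;
```

Plotting requiredDays against dataOut then leaves a visible break at the missing day, because plot does not draw through NaN values.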

Red Beard on 29 Jan 2013
Hi Arthur,
Thanks for your response. Can you please explain this a bit more? I get the following error when I try to run it:
Error using arrayfun
Non-scalar in Uniform output, at index 2, output 1.
Set 'UniformOutput' to false.
Am I correct in saying that x should be my array of dates? Or should it be dates?
Thanks.
Arthur on 31 Jan 2013
No, x is the argument of the anonymous function inside the arrayfun call; you do not need to define it yourself. In my code, I assumed that your dates are in an n-by-1 cell array called 'dates'. My code should run without errors, so maybe your data is stored in a slightly different way. How is the data currently organized?
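One way to investigate the arrayfun error, sketched under the assumption that it arises because find returns an empty result for a date that has no exact match in requiredDates. The numbers below are hypothetical stand-ins for serial date numbers, and noMatch is a name introduced here.

```matlab
% Hypothetical values: the third entry (3.5) has no exact match.
requiredDates = 1:5;
datenumbers   = [1 2 3.5 5];

% With 'UniformOutput' set to false, arrayfun returns a cell array,
% so an empty find result no longer raises an error.
idx = arrayfun(@(x) find(x == requiredDates), datenumbers, ...
               'UniformOutput', false);

% Flag the entries that did not match any required date.
noMatch = cellfun(@isempty, idx);
```

Any entry flagged in noMatch points to a date that is not exactly on the daily grid, for example because its time of day differs from that of the first entry.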
Ravi Pandit on 28 Dec 2016
I tried this with my data, which is an n-by-1 cell array, but it is still not working. My date format is given below:
'2009-04-10 00:10:00.000'
'2009-04-10 00:20:00.000'
'2009-04-10 00:30:00.000'
'2009-04-10 00:40:00.000'
'2009-04-10 00:50:00.000'
Can you tell me what mistake I am making?
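A possible adaptation for this format, offered only as a sketch: it assumes the problem is the format string ('dd.mm.yyyy HH:MM' does not match these strings) together with the one-day step, since the samples here are 10 minutes apart. The missing 00:30 entry and the dataIn values are hypothetical, and slot and stepsPerDay are names introduced here.

```matlab
% Hypothetical input in the format shown above, with 00:30 left out.
dates  = {'2009-04-10 00:10:00.000'; '2009-04-10 00:20:00.000'; ...
          '2009-04-10 00:40:00.000'; '2009-04-10 00:50:00.000'};
dataIn = [1; 2; 4; 5];

% Format string matching 'yyyy-mm-dd HH:MM:SS.FFF'.
dn = datenum(dates, 'yyyy-mm-dd HH:MM:SS.FFF');

% Ten-minute slots: 6 per hour, 24 hours per day.
stepsPerDay = 24 * 6;

% Round to the nearest slot to avoid exact floating-point comparison.
slot = round((dn - dn(1)) * stepsPerDay) + 1;

% NaN marks the slots that have no sample.
dataOut = nan(slot(end), 1);
dataOut(slot) = dataIn;
```

Computing the slot index by rounding, instead of testing for equality against a generated time grid, sidesteps the small rounding differences that fractional serial date numbers can carry.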

maria on 24 Jan 2013
Maybe there are smarter methods, but I think you can write a simple loop that examines each row and compares its date with the previous one (you can use "datenum" to convert the string to a number). If the two dates are not consecutive, rebuild the matrix by joining the first part, one or more new rows (with the missing dates and zero values), and the remaining part of the matrix, then continue the inspection. If you have problems doing this, show your test code and I will try to help.
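The loop described above could look roughly like this. It is a minimal sketch assuming daily data already converted to whole-day serial date numbers, with a single value column taken from the question; filledDays and filledValues are names introduced here.

```matlab
% Day numbers for 1, 2, 3, 5 and 6 January 2012 (4 January is missing).
dayNumbers = datenum(2012, 1, [1 2 3 5 6])';
values     = [-0.0046; -0.0047; -0.0042; -0.0038; -0.0036];

% Start the output with the first row, then walk through the rest.
filledDays   = dayNumbers(1);
filledValues = values(1);
for k = 2:numel(dayNumbers)
    gap = dayNumbers(k) - dayNumbers(k-1) - 1;   % number of missing days
    if gap > 0
        % Insert one zero-valued row per missing day.
        missingDays  = (dayNumbers(k-1)+1 : dayNumbers(k)-1)';
        filledDays   = [filledDays; missingDays];
        filledValues = [filledValues; zeros(gap, 1)];
    end
    % Append the real row.
    filledDays   = [filledDays; dayNumbers(k)];
    filledValues = [filledValues; values(k)];
end
```

Growing arrays inside a loop is slow for large data sets, which is one reason to prefer preallocating the full output when the date range is known in advance.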

Cedric Wannaz on 24 Jan 2013
A little variation on this loop: if your data spans nDays days, you could build a matrix of zeros or NaN and set only the rows (or columns) for the days on which you have a data point. For example:
data = ..
dataWithGaps = nan(nDays, nHeights) ; % or zeros(..).
for ii = 1 : size(data, 1)
    dayId = ... % Whichever solution you choose
                % for assoc. indices to days.
    dataWithGaps(dayId,:) = data(ii, :) ;
end
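A filled-in version of this outline, as a sketch only. The choice of dayId below (the offset in days from the first observation) is one assumption among several possible ones; the sample values come from the question, and dayNumbers is a name introduced here.

```matlab
% Day numbers and values for the five rows shown in the question.
dayNumbers = datenum(2012, 1, [1 2 3 5 6])';
data = [-0.0046 -0.0056 0.0024; ...
        -0.0047 -0.0051 0.0023; ...
        -0.0042 -0.0053 0.0021; ...
        -0.0038 -0.0049 0.0018; ...
        -0.0036 -0.0045 0.0017];

% Size of the full time frame and number of value columns.
nDays    = dayNumbers(end) - dayNumbers(1) + 1;
nHeights = size(data, 2);

dataWithGaps = nan(nDays, nHeights);   % or zeros(nDays, nHeights)
for ii = 1 : size(data, 1)
    % Assumed mapping: day offset from the first observation, 1-based.
    dayId = dayNumbers(ii) - dayNumbers(1) + 1;
    dataWithGaps(dayId, :) = data(ii, :);
end
```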
maria on 26 Jan 2013
That's really better. Thanks!

Peter Perkins on 3 Jan 2017
One solution to this question, as I understand it, would be to make the plot using dates, not indices. Using datetime, a very simplified version goes like this and creates the attached figure, with the gap at April that you're looking for:
>> Date = datetime(2012,[1;2;3;5;6],1,3,0,0);
>> Value = randn(5,1);
>> plot(Date,Value,'o')
Another solution if you have access to the latest release, R2016b, is to use a timetable, and then plot from that:
>> tt = timetable(Date,Value)
tt =
Date Value
____________________ ________
01-Jan-2012 03:00:00 -0.20497
01-Feb-2012 03:00:00 -0.12414
01-Mar-2012 03:00:00 1.4897
01-May-2012 03:00:00 1.409
01-Jun-2012 03:00:00 1.4172
>> newDates = min(Date):calmonths(1):max(Date);
>> tt = retime(tt,newDates,'FillWithConstant','Constant',0)
tt =
Date Value
____________________ ________
01-Jan-2012 03:00:00 -0.20497
01-Feb-2012 03:00:00 -0.12414
01-Mar-2012 03:00:00 1.4897
01-Apr-2012 03:00:00 0
01-May-2012 03:00:00 1.409
01-Jun-2012 03:00:00 1.4172
Prior to R2016b you can do that "by hand", using ismember(Date,newDates), but timetable makes it easy. Hope this helps.
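The "by hand" route might look like the following sketch, which assumes a release with datetime support. The Value numbers are copied from the rounded display above rather than regenerated with randn, and loc and newValue are names introduced here.

```matlab
% Same dates as in the example; values taken from the displayed output.
Date  = datetime(2012, [1;2;3;5;6], 1, 3, 0, 0);
Value = [-0.20497; -0.12414; 1.4897; 1.409; 1.4172];

% Complete monthly grid from the first to the last date.
newDates = (min(Date):calmonths(1):max(Date))';

% Position of each original date within the complete grid.
[~, loc] = ismember(Date, newDates);

% Fill with the constant 0, then copy the known values into place.
newValue = zeros(numel(newDates), 1);
newValue(loc) = Value;
```

Replacing zeros with nan in the preallocation step would leave a break in a line plot instead of a dip to zero.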