Hello,
I have a table with a series of categorical events and associated timestamp (duration datatype) at which these events happen as shown below. Typically the recordings are overnight. I want to take the timestamps and compute the differance in seconds between each successive timestamp;
Var1 Var2
... ...
'23:51:51.000' 'STAGE - N3'
'23:52:21.000' 'STAGE - R'
'00:04:51.000' 'STAGE - N1'
'00:10:21.000' 'STAGE - N2'
... ...
I have tried a couple of methods with etime() and between() etc. without much success. Seconds() and diff() works as shown below, however at midnight the roll over from 24:59:59 to 00:00:00 causes a large negitive roll over error (shown below). Is there a built in function that can handle this?
seconds(diff(myTable{:,1}))
...
30
-85650
330
30
...
Kind regards,
Christopher

 Réponse acceptée

There is no concept of days with a duration variable, so there are 2 options that I could think of.
  1. Add 86400 seconds to any negative values
  2. Use datetimes instead of durations (especially if you have dates associated with your times)
dur = duration({'23:51:51.000'
'23:52:21.000'
'00:04:51.000'
'00:10:21.000'});
ds = seconds(diff(dur))
ds = 3×1
30 -85650 330
% Add one day of seconds to any negative number
ds(ds<0) = ds(ds<0) + 24*60*60
ds = 3×1
30 750 330
% use datetimes
dur = datetime({'23:51:51.000'
'23:52:21.000'
'00:04:51.000'
'00:10:21.000'});
d = [0; diff(dur)<0];
dt = dur + days(cumsum(d))
dt = 4×1 datetime array
21-Jul-2022 23:51:51 21-Jul-2022 23:52:21 22-Jul-2022 00:04:51 22-Jul-2022 00:10:21
seconds(diff(dt))
ans = 3×1
30 750 330

4 commentaires

First of all, thank you all for your answers and guidance. I don't typically work with time or timedate data, usually its signal processing so the extra insight is great!
While I understand why the wrapping of duration() occurs, I am a little surprised that it’s not a more common problem. A duration(x,’wrap’) or similar could be a nice feature for automated handling.
The data is ascending as it’s a real time recording. Durations longer than the next largest unit (61 seconds etc.) are interesting, however I think this could be problematic in the long run in terms of maintaining the codebase.
Unfortunately I don’t have the associated date for conversion into timedate. While I could assigned two random successive days, its more work than just finding the negative number, which is why I’ve accepted Cris’ answer. As the minimum duration should never be less than 0 this should be robust. I had implemented a similar but less elegant mathematical correction. I am happy that I hadn’t been missing something obvious though!
Thank you all for the help!
Christopher
Christopher, you have kind of a mix of situations here, and that's what's causing trouble. I should say that this is an issue with the data, not with datetime or duration.
AFAICT, you have timestamps that only record the time of day, so they wrap back to 00:00:00 every 24 hrs. But your timestamps actually represent monotonically increasing times. So you'd either need to attach a date portion, which you don't have, or increment each day's data by 24hrs, which is what Cris is suggesting.
"Unfortunately I don’t have the associated date for conversion into timedate ... [and] durations longer than the next largest unit (61 seconds etc.) are interesting, however I think this could be problematic in the long run": Those are fundamentally conflicting requirements. Without something to indicate days, there's nothing ... to indcate days other than "they are in oreder so when there's a negative jump, it's a new day". It's not entirely clear what you have in mind for duration(x,’wrap’). Duration is elapsed time so the notion that it should know anything about wrapping at midnight is just fundamentally at odds with what it is.
"The data is ascending as it’s a real time recording.": I would argue that this is what makes using durations that exceed 24hrs exactly the right thing to do. Here's something similar to what Cris showed, going all old-school with checking for negative diffs:
dt = datetime(2022,7,21,20:30,0,0) % what really happened
dt = 1×11 datetime array
21-Jul-2022 20:00:00 21-Jul-2022 21:00:00 21-Jul-2022 22:00:00 21-Jul-2022 23:00:00 22-Jul-2022 00:00:00 22-Jul-2022 01:00:00 22-Jul-2022 02:00:00 22-Jul-2022 03:00:00 22-Jul-2022 04:00:00 22-Jul-2022 05:00:00 22-Jul-2022 06:00:00
dur = timeofday(dt) % what you have
dur = 1×11 duration array
20:00:00 21:00:00 22:00:00 23:00:00 00:00:00 01:00:00 02:00:00 03:00:00 04:00:00 05:00:00 06:00:00
incr = hours(24) * [0 (diff(dur) < 0)]
incr = 1×11 duration array
0 hr 0 hr 0 hr 0 hr 24 hr 0 hr 0 hr 0 hr 0 hr 0 hr 0 hr
dur = dur + cumsum(incr) % what I claim you want
dur = 1×11 duration array
20:00:00 21:00:00 22:00:00 23:00:00 24:00:00 25:00:00 26:00:00 27:00:00 28:00:00 29:00:00 30:00:00
One more thing: you mention etime and between. I really strongly recommend sticking with datetimes and durations, not mixing in datenums and datevecs. In the current doc, etime is even discouraged use.
Hi Peter,
Thanks for the response. I understand what you are saying, and agree with most of it however let me present the problem in a slightly different light.
  1. Why does my data not have an associated date: Data were collected for the formation of a large database. Prior to allowing ‘outside’ use, the identification of personal information and removal process took place. Current best practice - the panel took the decision to remove all date information to ensure anonymity to participants.
  2. I agree what you are saying in terms of datetime and duration, if the time is provided with an associated date. Then this acts like a linear timeline which does not and should not wrap.
  3. However, when given as only a time or duration (with no associated date) wrapping at midnight should occur. Why; If I asked you the difference between 23:55:00 and 00:05:00, you would tell me ten minutes, not -23:00:00, as this is the rotational nature of an analogue clock. I now believe the issue here is really how these values are interpreted. I.e. MATLAB seems to consider a time only duration/timedate as having an associated date that was never specified by the user i.e. 01/01/1980 or similar, when this was never given (and cannot be seen or queried by the user?). I believe the circularity - considering time only - was the point @Star Strider was making too, but I will let him comment on that.
  4. If given a list of timestamps (which do not include dates) which exceed several days IMO there should be an inbuilt input argument to handle this rollover. As without a date, the rollover is inherently guaranteed. This is what I was proposing with D = duration(X,’wrap’). Note, I am only proposing this for instances where a time only (no date) timestamp is given.
  5. Looking further to R, Python and Excel there are several packages to handle this so I do now believe that this is justified.
I hope that makes my rational somewhat clearer, and thankyou for the reply too @Peter Perkins, its always really good to chat with MATLAB staff and experienced users to understand more about MATLAB and see new features and options.
I have implemented what Cris showed and it is working. (Though I do appreciate seeing the old school work around) Noted about etime, I will not use it.
Kind regards,
Christopher
"MATLAB seems to consider a time only duration/timedate as having an associated date that was never specified by the user i.e. 01/01/1980 or similar, when this was never given (and cannot be seen or queried by the user?)"
These three things are different:
datetime("10:39:04",Format="HH:mm:ss") % assumes current date
ans = datetime
10:39:04
datetime("10:39:04",InputFormat="HH:mm:ss") % assumes current date
ans = datetime
25-Jul-2022 10:39:04
duration("10:39:04",Format="hh:mm:ss")
ans = duration
10:39:04
The first is often a bad idea, although "cannot be seen or queried by the user?" is overstating it. The second is convenient for interactive use, but a terrible idea for programmatic use for obvious reasons. The third seems like what you want, except I think that you are looking for
d1 = duration(["00:00:00" "08:00:00" "16:00:00" "00:00:00 " "08:00:00" "16:00:00" "00:00:00"],Format="hh:mm:ss")
d1 = 1×7 duration array
00:00:00 08:00:00 16:00:00 00:00:00 08:00:00 16:00:00 00:00:00
to automatically increment by multiples of 24hrs as appropriate, and behave instead "as if" calling something that returns this
d2 = duration(["00:00:00" "08:00:00" "16:00:00" "24:00:00 " "32:00:00" "40:00:00" "48:00:00"],Format="hh:mm:ss")
d2 = 1×7 duration array
00:00:00 08:00:00 16:00:00 24:00:00 32:00:00 40:00:00 48:00:00
It's possible that you are looking to call something that returns something that is like d1 but "knows" that it is monotonic. We generally stay away from that kind of thing because in that case it would be an array that contains values that look like one thing but behave like another. There are a lot of potential pitfalls in doing that.

Connectez-vous pour commenter.

Plus de réponses (1)

Unfortunately, duration values are limited to one day, and reset at midnight. There does not appear to be any workaround for that.
Example —
t = datetime(2022,01,01) + minutes(0:2880).';
s = sin(2*pi*(0:numel(t)-1)/720);
figure
plot(t,s)
grid
tmin = t;
tmin.Format = 'HH:mm:ss';
dt = duration(string(tmin));
figure
plot(dt,s)
grid
I would just keep them as datetime values (or convert them to datetime values), and do any analyses using that result.
.

3 commentaires

Unfortunately, duration values are limited to one day, and reset at midnight.
I don't agree with this statement as written. You can have a duration array spanning more than 24 hours. Any of the components can be greater than the equivalent of 1 unit in the next largest component (like 61 seconds, 61 minutes, or 25 hours.)
y = duration(25, 61, 61)
y = duration
26:02:01
More accurate IMO would be that data representing times of day as hours, minutes, and seconds, when converted to duration arrays, have that discontinuity at midnight.
If your data is "sorted ascending" (with any drop representing crossing that midnight discontinuity) you can use diff to identify where those discontinuities occur and add 24 hours until the time is really sorted ascending.
dt = datetime('today')+minutes([-5 5])
dt = 1×2 datetime array
20-Jul-2022 23:55:00 21-Jul-2022 00:05:00
du = timeofday(dt)
du = 1×2 duration array
23:55:00 00:05:00
difference = [0, diff(du)]
difference = 1×2 duration array
00:00:00 -23:50:00
du(difference < 0) = du(difference < 0) + hours(24)
du = 1×2 duration array
23:55:00 24:05:00
@Steven Lord — the duration arrays I’ve worked with, although spanning more than one day (as illustrated in my answer here) do not automatically respect days, so in this example, while the datetime values span two days, the duration values wrap at 24 hours.
When I converted my datetime values to duration values (using the string function), the duration function threw an error saying that it does not recognise 'dd/MM/yyyy HH:mm:ss' format that my datetime vector automatically provided, and specifically suggested using only the 'HH:mm:ss' section of the datetime array.
So, I did.
I have no idea what a workaround for that would be, other than to work only with datetime arrays with data that span multiple days, such as in situations like this.
The duration vector can include multiple days, however the identities of the days themselves are lost —
t = datetime(2022,01,01) + minutes(0:2880).';
s = sin(2*pi*(0:numel(t)-1)/720);
figure
plot(t,s)
grid
tmin = t;
tmin.Format = 'dd:HH:mm:ss';
dt = duration(string(tmin));
figure
plot(dt,s)
grid
.
SS, durations don't wrap at 24hrs, they really don't.
I mean, I know you know this, but just to make sure we're on the same page, datetimes represent points along the real-world time line, and durations represent elapsed time. Another interpretation is that durations repesent time along an "engineering time line" that is not measured with calendar units. In other words, elapsed time from some origin.
"converted my datetime values to duration values (using the string function)": I don't see the code for what I guess was your original stab at that, but if it was something like
dur = duration(string(dt))
then yes, duration will throw an error because the text that string returns is not in a format that the duration ctor understands: it will (almost certainly) have a y/m/d portion, and duration has no notion of calendars.
This
dur = duration(string(dt,'dd:HH:mm:ss'))
is sort of a clever hack around that error, but its kind of an accident: dd means "day of month" in datetime formats, and your datetimes start at 1-Jan, and so you get day numbers from 1 to whatever. dd means "elapsed days" in duration formats. That's why you see the ticks on your second plot like that, starting at 24hr.
There are two recommended ways to get durations from datetimes, depending on what you are wanting to do:
1) chop them off as times of day, i.e. all in [0,24) hrs: timeofday(dt)
dt = datetime(2022,7,22,0:8:48,0,0);
dur = timeofday(dt)
dur = 1×7 duration array
00:00:00 08:00:00 16:00:00 00:00:00 08:00:00 16:00:00 00:00:00
2) get them as a monotonically increasing sequence
dur = dt - dt(1)
dur = 1×7 duration array
00:00:00 08:00:00 16:00:00 24:00:00 32:00:00 40:00:00 48:00:00
So:
t = datetime(2022,01,01) + minutes(0:2880).';
s = sin(2*pi*(0:numel(t)-1)/720);
plot(t,s)
plot(t-t(1),s)
plot(t,s,DatetimeTickFormat='hh:mm:ss')

Connectez-vous pour commenter.

Produits

Version

R2021b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by