Specifying a point in time is a pretty common task. And it is extraordinary difficult. Lets have a look at the different ways to specify a point in time.
Imagine the following scenario: We record some events, e.g. actions of a credit card user. In case of potential fraud, an automatic system locks the card and a human has a look at the actions and times. The human monitoring system requests all data related to this credit card user. Of course, most if it has time stamps. As it is another system, the time has to be transfered from the recording system to the monitoring system. And it might well be that one does not want to give direct database access but rather a REST-like API endpoint. Thus the way to serialize the datetime has to be a string.
Now... how do you encode a datetime to a string?
Unix Time
The Unix Time is the number of seconds that have elapsed since 00:00:00 Coordinated Universal Time (UTC), Thursday, 1 January 1970, minus the number of leap seconds that have taken place since then.
Problems:
- ⊝ Not human-readable
- ⊝ Year 2038 problem for 32-bit systems.
- ⊝ Everything before the 1st of January in 1970 is not representable
Especially the last point is a killer criterium against Unix time in many applications.
Calendar systems
Most countries use the Gregorian calendar today,[1] so the type of calendar is at least something we don't have to worry about. But then the problems begin. I thought ISO 8601 would be a simple, human-readable, unabiguous format that looks like this:
2018-01-27T16:45:01+00:00
2018-01-27T16:45:01+00:00
1990-12-31T23:59:59+00:00
But due to leap seconds / days and due to changes in timezones I'm pretty sure there is one point in time which has two different ISO 8601 time stamps. Please let me know in the comments if you see one (best if it can be shown with some programming languages).
So the TL;DR about ISO 8601:
- ⊕ Lexicographical ordering is also the ordering by time.
- ⊕ It is human readable
- ⊝ ISO 8601 does not only have
2018-01-27T16:45:01+00:00
but also2018-01-27T16:45:01Z
and20180127T164501Z
were Z means Zero offset for UTC. So there are multiple representations for the same point in time. - ⊝ Anything before
0000-01-01T00:00:00+00:00
is not representable.[2] - ⊝ Year zero doesn't exist
Falsehood Programmer beliefies
- Months have either 30 or 31 days.
- False! You forgot February.
- Years have 365 days.
- False! You forgot February again! There are leap years.
- A given geographic point (e.g. a city) has always the same time zone.
- False! You forgot DST.
Python
Let's check that with Python:
import datetime
>>> datetime.datetime(0, 11, 1).isoformat()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: year is out of range
>>> datetime.datetime(1, 11, 1).isoformat()
'0001-11-01T00:00:00'
>>> datetime.datetime(9999, 11, 1).isoformat()
'9999-11-01T00:00:00'
>>> datetime.datetime(99999, 11, 1).isoformat()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: year is out of range
>>> datetime.datetime(2017, 12, 24).isoformat()
'2017-12-24T00:00:00'
>>> datetime.datetime(2017, 12, 24, tzinfo=datetime.timezone.utc).isoformat()
'2017-12-24T00:00:00+00:00'
>>> a = datetime.datetime(2017, 12, 24)
>>> b = datetime.datetime(2017, 12, 24, tzinfo=datetime.timezone.utc)
>>> a - b
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: can't subtract offset-naive and offset-aware datetimes
## Parsing stuff
>>> datetime.datetime.strptime('0020-11-01T00:00:00', '%Y-%m-%dT%H:%M:%S')
datetime.datetime(20, 11, 1, 0, 0)
By now, Datetime parsing with Python is usually done like this:
from dateutil import parser as dateparser
date_obj = dateparser.parse("Tue May 08 15:14:45 +0800 2012")
Things that are unintuitive for me:
- ⊝ Why don't all datetime objects have timezone information? (See SO question)
- ⊝ Python can create a format it can't consume again.
See also:
- Python's strftime directives
- Olson database
- PEP 321: Date/Time Parsing and Formatting
- PEP 431: Time zone support improvements (Withdrawn)
pytz
: In case you need anything for time zones- Working with datetime objects and timezones in Python
JavaScript
var d = new Date();
console.log(d.toJSON());
gives 2018-01-27T22:51:17.691Z
. So it is a ISO 8601 datetime with miliseconds
and zero offset. Nice!
You can also find this in the ECMA specification:
ECMAScript defines a string interchange format for date-times based upon a simplification of the ISO 8601 Extended Format. The format is as follows:
YYYY-MM-DDTHH:mm:ss.sssZ
PHP5
$ php -a
Interactive mode enabled
php > $dt = new DateTime();
php > echo $dt->format(DateTime::ATOM);
2018-01-27T23:58:12+01:00
You can see that PHP gives you the ISO 8601 string with the zone offset, but not the milliseconds.
Footnotes
- [1] I've heard the Islamic calendar is used in Saudi Arabia.
- [2] Or only by mutual agreement... which is the same as breaking the standard, if you ask me.