There are quite a few different conventions for binary datetime, depending on the platform or protocol. Some of these have severe drawbacks. For example, people using Unix time (seconds since Jan 1, 1970, usually in a 32-bit integer) think that they are safe until near the year 2038. But cases can and do arise where arithmetic manipulations cause serious problems. Consider the computation of the average of two datetimes: if one calculates it with averageTime = (time1 + time2)/2, there will be overflow even with dates beginning in 2004. Moreover, even if these problems don’t occur, there is the issue of conversion back and forth between different systems.
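The overflow is easy to reproduce. The following is a minimal sketch in plain Java, not ICU code; the class and method names are hypothetical, and `int` stands in for a 32-bit Unix time.

```java
// Hypothetical helper class; the names are illustrative only.
final class UnixTimeAverage {
    /** Naive average: the int sum wraps around once it exceeds 2^31 - 1. */
    static int naiveAverage(int time1, int time2) {
        return (time1 + time2) / 2;
    }

    /** Widening to 64 bits before adding avoids the overflow. */
    static int safeAverage(int time1, int time2) {
        return (int) (((long) time1 + (long) time2) / 2);
    }

    public static void main(String[] args) {
        int time1 = 1_100_000_000; // a moment in November 2004, in seconds since 1970
        int time2 = 1_100_000_100; // 100 seconds later
        System.out.println(naiveAverage(time1, time2)); // negative: the sum overflowed
        System.out.println(safeAverage(time1, time2));
    }
}
```

Any two times whose sum exceeds 2^31 - 1 trigger the problem, which is why dates from early 2004 onward are already affected.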
Binary datetimes differ in a number of ways: the data type, the unit, and the epoch (origin). We’ll refer to these as time scales. For example (sorted by epoch and unit, descending):
|Source||Data type||Epoch||Unit|
|MacOS X|| ||2001-Jan-01||seconds (and fractions thereof)|
|Java|| ||1970-Jan-01||milliseconds|
|Joda|| ||1970-Jan-01||milliseconds|
|Gnome|| ||1970-Jan-01||microseconds|
|Windows|| ||1601-Jan-01||ticks (100 nanoseconds; finest granularity in industry)|
|.NET|| ||0001-Jan-01||ticks (100 nanoseconds; finest granularity in industry)|
|ICU Universal Time Scale|| ||0001-Jan-01||same as .NET but allows 29000BC..29000AD|
All of the epochs start at 00:00 (the earliest possible time on the day in question) and are usually assumed to be UTC.
The ranges, in years, for different data types are given in the following table. The range for integer types includes the entire range expressible with positive and negative values of the data type. The range for double is the range that would be allowed without losing precision to the corresponding unit.
|Units||64-bit integer||Double||32-bit integer|
|100 nanoseconds (tick)||58,454.20||28.54||0.00|
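The figures in the tick row can be reproduced with a short calculation. This sketch assumes a 365.25-day year (an assumption that happens to match the numbers above) and treats a double as having 53 bits of integer precision; the class and method names are hypothetical.

```java
// Hypothetical helper; not part of ICU.
final class TimeScaleRange {
    // Assumption: a 365.25-day year, which reproduces the tick row above.
    static final double SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60;

    /** Years spanned by 2^bits distinct values of a unit lasting unitSeconds. */
    static double rangeInYears(int bits, double unitSeconds) {
        return Math.pow(2, bits) * unitSeconds / SECONDS_PER_YEAR;
    }

    public static void main(String[] args) {
        double tick = 100e-9; // one tick is 100 nanoseconds
        System.out.printf("64-bit integer: %.2f%n", rangeInYears(64, tick));
        System.out.printf("double:         %.2f%n", rangeInYears(53, tick));
        System.out.printf("32-bit integer: %.2f%n", rangeInYears(32, tick));
    }
}
```

Passing a different unit length, such as `1e-3` for milliseconds, gives the corresponding range for other units.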
ICU implements a universal time scale that is similar to the .NET framework’s System.DateTime. The universal time scale is a 64-bit integer that holds ticks since midnight, January 1st, 0001. Negative values are supported. This has enough range to guarantee that calculations involving dates around the present are safe.
The universal time scale always measures time according to the proleptic Gregorian calendar. That is, the Gregorian calendar’s leap year rules are used for all times, even before 1582 when it was introduced. (This is different from the default ICU calendar, which switches from the Julian to the Gregorian calendar in 1582.)
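To make the definition concrete, the sketch below counts ticks from midnight, January 1st, 0001 to another date. It uses `java.time.LocalDate`, which also follows proleptic Gregorian (ISO) rules, as a stand-in; it is not ICU code, and the class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Hypothetical helper; not part of ICU.
final class UniversalEpoch {
    static final long TICKS_PER_SECOND = 10_000_000L; // one tick is 100 ns
    static final long SECONDS_PER_DAY = 86_400L;

    /** Ticks from 0001-01-01 to the given date, both taken at midnight UTC. */
    static long ticksAtMidnight(LocalDate date) {
        long days = ChronoUnit.DAYS.between(LocalDate.of(1, 1, 1), date);
        return days * SECONDS_PER_DAY * TICKS_PER_SECOND;
    }

    public static void main(String[] args) {
        // Tick count of the Unix and Java epoch, 1970-01-01.
        System.out.println(ticksAtMidnight(LocalDate.of(1970, 1, 1)));
    }
}
```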
ICU provides conversion functions to and from all other major time scales, allowing datetimes in any time scale to be converted to the universal time scale, safely manipulated, and converted back to any other datetime time scale.
So how did we decide what to use for the universal time scale? Java time has plenty of range, but cannot represent a .NET System.DateTime value without severe loss of precision. ICU4C time addresses this by using a double that is otherwise equivalent to the Java time. However, doubles have their own disadvantages. They do provide for much more graceful degradation in arithmetic operations, but they have only 53 bits of precision, which means that they will lose precision when converting back and forth to ticks. What would really be nice would be a long double (80 bits, with a 64-bit mantissa), but that is not supported on most systems.
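The 53-bit limit can be seen by pushing a present-day tick count through a double and back. This is a plain-Java illustration with hypothetical names; the sample value is one tick past midnight on 1970-01-01, counted from 0001-01-01.

```java
// Hypothetical helper; not part of ICU.
final class DoublePrecisionLoss {
    /** Converts a tick count to a double and back to a long. */
    static long throughDouble(long ticks) {
        return (long) (double) ticks;
    }

    public static void main(String[] args) {
        long ticks = 621_355_968_000_000_001L; // one tick after the Unix epoch
        long back = throughDouble(ticks);
        // At this magnitude adjacent doubles are 128 ticks apart, so the
        // trailing tick cannot survive the round trip.
        System.out.println(ticks - back);
    }
}
```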
The Unix extended time uses a structure with two components: time in seconds and a fractional field (microseconds). However, this is clumsy, slow, and prone to error (you always have to keep track of overflow and underflow in the fractional field).
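The bookkeeping that a two-field representation requires can be sketched as follows. This is an illustrative Java class with hypothetical names, not the Unix structure itself; it shows that every arithmetic operation has to carry or borrow between the seconds and microseconds fields.

```java
// Hypothetical two-field time value; illustrative only.
final class SecondsAndMicros {
    final long seconds;
    final int micros; // always normalized to 0..999_999

    SecondsAndMicros(long seconds, long micros) {
        // Carry whole seconds out of the fractional field. floorDiv/floorMod
        // also handle a negative (underflowed) microsecond count.
        this.seconds = seconds + Math.floorDiv(micros, 1_000_000L);
        this.micros = (int) Math.floorMod(micros, 1_000_000L);
    }

    SecondsAndMicros plus(SecondsAndMicros other) {
        return new SecondsAndMicros(seconds + other.seconds, (long) micros + other.micros);
    }

    SecondsAndMicros minus(SecondsAndMicros other) {
        return new SecondsAndMicros(seconds - other.seconds, (long) micros - other.micros);
    }
}
```

With a single 64-bit tick count, the same operations are ordinary integer addition and subtraction.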
BigDecimal would allow for arbitrary precision and arbitrary range, but we did not want to use this as the normal type, because it is slow and does not have a fixed size.
Because of these issues, we concluded that the .NET System.DateTime is the best timescale to use. However, we use the full range allowed by the data type, allowing for datetimes back to 29,000 BC and up to 29,000 AD. (System.DateTime uses only 62 bits and only supports dates from 0001 AD to 9999 AD.) This time scale is very fine grained, does not lose precision, and covers a range that will meet almost all requirements. It will not handle the range that Java times do, but frankly, being able to handle dates before 29,000 BC or after 29,000 AD is of very limited interest.
ICU provides routines to convert from other timescales to the universal time scale, to convert from the universal time scale to other timescales, and to get information about a particular timescale. In all of these routines, the timescales are referenced using an integer constant, according to the following table:
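|Source||ICU4C||ICU4J|
|Java||UDTS_JAVA_TIME||JAVA_TIME|
|Unix||UDTS_UNIX_TIME||UNIX_TIME|
|ICU4C||UDTS_ICU4C_TIME||ICU4C_TIME|
|Windows FILETIME||UDTS_WINDOWS_FILE_TIME||WINDOWS_FILE_TIME|
|.NET DateTime||UDTS_DOTNET_DATE_TIME||DOTNET_DATE_TIME|
|Macintosh (old)||UDTS_MAC_OLD_TIME||MAC_OLD_TIME|
|Macintosh||UDTS_MAC_TIME||MAC_TIME|
|Excel||UDTS_EXCEL_TIME||EXCEL_TIME|
|DB2||UDTS_DB2_TIME||DB2_TIME|
|Unix with microseconds||UDTS_UNIX_MICROSECONDS_TIME||UNIX_MICROSECONDS_TIME|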
The routine that gets a particular piece of information about a timescale takes an integer constant that identifies the particular piece of information, according to the following table:
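|Value||ICU4C||ICU4J|
|Precision||UTSV_UNITS_VALUE||UNITS_VALUE|
|Epoch offset||UTSV_EPOCH_OFFSET_VALUE||EPOCH_OFFSET_VALUE|
|Minimum “from” value||UTSV_FROM_MIN_VALUE||FROM_MIN_VALUE|
|Maximum “from” value||UTSV_FROM_MAX_VALUE||FROM_MAX_VALUE|
|Minimum “to” value||UTSV_TO_MIN_VALUE||TO_MIN_VALUE|
|Maximum “to” value||UTSV_TO_MAX_VALUE||TO_MAX_VALUE|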
Here is what the values mean:
- Precision – the precision of the timescale, in ticks.
- Epoch offset – the distance from the universal timescale’s epoch to the timescale’s epoch, in the timescale’s precision.
- Minimum “from” value – the minimum timescale value that can safely be converted to the universal timescale.
- Maximum “from” value – the maximum timescale value that can safely be converted to the universal timescale.
- Minimum “to” value – the minimum universal timescale value that can safely be converted to the timescale.
- Maximum “to” value – the maximum universal timescale value that can safely be converted to the timescale.
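One way to read these definitions is as the ingredients of a simple formula: add the epoch offset to a timescale value, then multiply by the precision to get ticks. The sketch below works through that arithmetic in plain Java. It is an illustration under that assumption, not ICU's implementation; the class, field, and constant names are hypothetical, and the “to” limits are left out.

```java
// Hypothetical model of a timescale; not part of ICU.
final class TimeScaleSketch {
    final long units;       // precision: ticks per unit of this timescale
    final long epochOffset; // universal epoch to this epoch, in this timescale's units

    TimeScaleSketch(long units, long epochOffset) {
        this.units = units;
        this.epochOffset = epochOffset;
    }

    // Assumed: 1970-01-01 lies 62,135,596,800 seconds after 0001-01-01.
    static final TimeScaleSketch JAVA_MILLIS =
            new TimeScaleSketch(10_000L, 62_135_596_800_000L);
    static final TimeScaleSketch UNIX_SECONDS =
            new TimeScaleSketch(10_000_000L, 62_135_596_800L);

    /** Smallest value whose conversion to ticks still fits in a long. */
    long fromMin() {
        return Long.MIN_VALUE / units - epochOffset;
    }

    /** Largest value whose conversion to ticks still fits in a long. */
    long fromMax() {
        return Long.MAX_VALUE / units - epochOffset;
    }

    /** Converts a value in this timescale to universal ticks. */
    long toUniversal(long value) {
        if (value < fromMin() || value > fromMax()) {
            throw new IllegalArgumentException("value out of range: " + value);
        }
        return (value + epochOffset) * units;
    }
}
```

Under these assumptions, both `JAVA_MILLIS.toUniversal(0)` and `UNIX_SECONDS.toUniversal(0)` give the tick count of 1970-01-01, and the “from” limits mark where the multiplication would otherwise overflow.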
You can convert from other timescale values to the universal timescale using the “from” methods. In ICU4C, you use:

    UErrorCode err = U_ZERO_ERROR;
    int64_t unixTime = ...;
    int64_t universalTime;

    universalTime = utmscale_fromInt64(unixTime, UDTS_UNIX_TIME, &err);
In ICU4J, you use:

    long javaTime = ...;
    long universalTime;

    universalTime = UniversalTimeScale.from(javaTime, UniversalTimeScale.JAVA_TIME);
You can convert values in the universal timescale to other timescales using the “to” methods. In ICU4C, you use:

    UErrorCode err = U_ZERO_ERROR;
    int64_t universalTime = ...;
    int64_t unixTime;

    unixTime = utmscale_toInt64(universalTime, UDTS_UNIX_TIME, &err);
In ICU4J, you use:

    long universalTime = ...;
    long javaTime;

    javaTime = UniversalTimeScale.to(universalTime, UniversalTimeScale.JAVA_TIME);
That’s all there is to it!
If the conversion is out of range, the ICU4C routines will set the error code to U_ILLEGAL_ARGUMENT_ERROR, and the ICU4J methods will throw IllegalArgumentException. In ICU4J, you can avoid out-of-range conversions by using the BigDecimal methods:

    long fileTime = ...;
    double icu4cTime = ...;
    BigDecimal utICU4C, utFile, utUnix, unixTime, macTime;

    utFile = UniversalTimeScale.bigDecimalFrom(fileTime, UniversalTimeScale.WINDOWS_FILE_TIME);
    utICU4C = UniversalTimeScale.bigDecimalFrom(icu4cTime, UniversalTimeScale.ICU4C_TIME);
    unixTime = UniversalTimeScale.toBigDecimal(utFile, UniversalTimeScale.UNIX_TIME);
    macTime = UniversalTimeScale.toBigDecimal(utICU4C, UniversalTimeScale.MAC_TIME);
    utUnix = UniversalTimeScale.bigDecimalFrom(unixTime, UniversalTimeScale.UNIX_TIME);
Note: Because the Universal Time Scale has a finer resolution than some other time scales, time values that can be represented exactly in the Universal Time Scale will be rounded when converting to these time scales, and resolution will be lost. If you convert these values back to the Universal Time Scale, you will not get the same time value that you started with. If the time scale to which you are converting uses a double to represent the time value, you may lose precision even though the double supports a range that is larger than the range supported by the Universal Time Scale.
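The round-trip loss described in the note can be illustrated with plain integer arithmetic. The sketch below converts ticks to milliseconds since 1970 and back. It assumes rounding to the nearest millisecond and an epoch offset of 62,135,596,800,000 milliseconds; the names are hypothetical, and this is not the ICU implementation.

```java
// Hypothetical helper; not part of ICU.
final class RoundTripLoss {
    static final long TICKS_PER_MILLI = 10_000L;
    static final long EPOCH_OFFSET_MILLIS = 62_135_596_800_000L; // 0001-01-01 to 1970-01-01

    /** Ticks to milliseconds since 1970, rounded to the nearest millisecond (assumed). */
    static long toJavaMillis(long universal) {
        return Math.floorDiv(universal + TICKS_PER_MILLI / 2, TICKS_PER_MILLI)
                - EPOCH_OFFSET_MILLIS;
    }

    /** Milliseconds since 1970 back to ticks. */
    static long fromJavaMillis(long millis) {
        return (millis + EPOCH_OFFSET_MILLIS) * TICKS_PER_MILLI;
    }

    public static void main(String[] args) {
        long universal = 621_355_968_000_012_345L; // 1.2345 ms after the Unix epoch
        long back = fromJavaMillis(toJavaMillis(universal));
        System.out.println(universal - back); // ticks lost in the round trip
    }
}
```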
Currently, ICU does not support direct formatting or parsing of Universal Time Scale values. If you want to format a Universal Time Scale value, you will need to convert it to an ICU time scale value first. Use UDTS_ICU4C_TIME with ICU4C, and UniversalTimeScale.JAVA_TIME with ICU4J.
When you parse a datetime string, the result will be an ICU time scale value. You can convert this value to a Universal Time Scale value using UDTS_ICU4C_TIME with ICU4C, and UniversalTimeScale.JAVA_TIME with ICU4J.
See the previous section, Converting, for details of how to do the conversion.
To get information about a particular timescale in ICU4C, use:

    UErrorCode err = U_ZERO_ERROR;
    int64_t unixEpochOffset = utmscale_getTimeScaleValue(
        UDTS_UNIX_TIME,
        UTSV_EPOCH_OFFSET_VALUE,
        &err);
In ICU4J, use:

    long javaEpochOffset = UniversalTimeScale.getTimeScaleValue(
        UniversalTimeScale.JAVA_TIME,
        UniversalTimeScale.EPOCH_OFFSET_VALUE);
If the integer constants for selecting the timescale or the timescale value are out of range, the ICU4C routines will set the error code to U_ILLEGAL_ARGUMENT_ERROR, and the ICU4J methods will throw IllegalArgumentException.