drill-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chun Chang (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DRILL-5360) Timestamp type documented as UTC, implemented as local time
Date Mon, 05 Feb 2018 19:57:00 GMT

     [ https://issues.apache.org/jira/browse/DRILL-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Chun Chang updated DRILL-5360:
    Priority: Critical  (was: Major)

> Timestamp type documented as UTC, implemented as local time
> -----------------------------------------------------------
>                 Key: DRILL-5360
>                 URL: https://issues.apache.org/jira/browse/DRILL-5360
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers
>            Priority: Critical
>             Fix For: 2.0.0
> The Drill documentation implies that the {{Timestamp}} type is in UTC:
> bq. JDBC timestamp in year, month, date hour, minute, second, and optional milliseconds
format: yyyy-MM-dd HH:mm:ss.SSS. ... TIMESTAMP literals: Drill stores values in Coordinated
Universal Time (UTC). Drill supports time functions in the range 1971 to 2037. ... Drill does
not support TIMESTAMP with time zone.
> The above is ambiguous. The first part talks about JDBC timestamps. From the JDK Javadoc:
> bq. Timestamp: A thin wrapper around java.util.Date. ... Date class is intended to reflect
coordinated universal time (UTC)...
> So, a JDBC timestamp is intended to represent time in UTC. (The "indented to reflect"
statement leaves open the possibility of misusing {{Date}} to represent times in other time
zones. This was common practice in early Java development and was the reason for the eventual
development of the Joda, then Java 8 date/time classes.)
> The Drill documentation implies that timestamp *literals* are in UTC, but a careful read
of the documentation does allow an interpretation that the internal representation can be
other than UTC. If this is true, then we would also rely on a liberal reading of the Java
`Timestamp` class to also not be UTC. (Or, we rely on the Drill JDBC driver to convert from
the (unknown) server time zone to a UTC value returned by the Drill JDBC client.)
> Still, a superficial reading (and common practice) would suggest that a Drill Timestamp
should be in UTC.
> However, a test on a Mac, with an embedded Drillbit (run in the Pacific time zone, with
Daylight Savings Time in effect) shows that the Timestamp binary value is actual local time:
> {code}
>       long before = System.currentTimeMillis();
>       long value = getDateValue(client, "SELECT NOW() FROM (VALUES(1))" );
>       double hrsDiff = (value - before) / (1000.00 * 60 * 60);
>       System.out.println("Hours: " + hrsDiff);
> {code}
> The above gets the actual UTC time from Java. Then, it runs a query that gets Drill's
idea of the current time using the {{NOW()}} function. (The {{getDateValue}} function uses
the new test framework to access the actual {{long}} value from the returned value vector.)
Finally, we compute the difference between the two times, converted to hours. Output:
> {code}
> Hours: -6.9999975
> {code}
> As it turns out, this is the difference between UTC and PDT. So, the time is in local
time, not UTC.
> Since the documentation and implementation are both ambiguous, it is hard to know the
intent of the Drill Timestamp. Clearly, common practice is to use UTC. But, there is wiggle-room.
> If the Timestamp value is supposed to be local time, then Drill should provide a function
to return the server's time zone offset (in ms) from UTC so that the client can to the needed
local-to-UTC conversion to get a true timestamp.
> On the other hand, if the Timestamp is supposed to be UTC (per common practice), then
{{NOW()}} should not report local time, it should return UTC.
> Further, if {{NOW()}} returns local time, but Timestamp literals are UTC, then it is
hard to see how any query can be rationally written if one timestamp value is local, but a
literal is UTC.
> So, job #1 is to define the Timestamp semantics. Then, use that to figure out where the
bug lies to make implementation consistent with documentation (or visa-versa.)

This message was sent by Atlassian JIRA

View raw message