flink-issues mailing list archives

From "Kent Murra (JIRA)" <j...@apache.org>
Subject [jira] [Created] (FLINK-7735) Improve date/time handling in publically-facing Expressions
Date Thu, 28 Sep 2017 18:17:00 GMT
Kent Murra created FLINK-7735:

             Summary: Improve date/time handling in publically-facing Expressions
                 Key: FLINK-7735
                 URL: https://issues.apache.org/jira/browse/FLINK-7735
             Project: Flink
          Issue Type: Wish
          Components: Table API & SQL
            Reporter: Kent Murra
            Priority: Minor

I would like to discuss potential improvements to date/time/timestamp handling in Expressions.
Flink now offers expression push-down for table sources, including time-related functions, so
time zone handling is more visible to the end user.

I think the current usage of java.sql.Time, java.sql.Date, and java.sql.Timestamp is fairly
ambiguous. We take a java.util.Date subclass in the constructor of Literal and assume that its
year, month, day, and hour fields apply to UTC rather than to the user's default time zone.
Based on that assumption, Flink silently [adjusts the value of the epoch timestamp|https://github.com/apache/flink/blob/master/flink-libraries/flink-table/src/main/scala/org/apache/flink/table/expressions/literals.scala#L106]
when converting to the RexLiteral. This behaves correctly if the user assumes that the
year/month/day/hour fields of the Date object are in the same time zone that the SQL statement
assumes (which is UTC). However, if the user works with the epoch timestamp directly (it is
publicly accessible via getTime()), this can lead to incorrect results. Moreover, it is
confusing for anyone reasoning about the time zones of their data, and it takes some research
to determine the correct behavior.
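
To make the ambiguity concrete, here is a minimal sketch of the kind of adjustment described above. It assumes the adjustment amounts to adding the user's zone offset to the epoch value; the class and method names (LiteralShiftSketch, shiftByZoneOffset, formatUtc) are hypothetical and are not Flink API. A fixed GMT-07:00 zone stands in for the user's default time zone so that the result does not depend on the machine running it.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class LiteralShiftSketch {

    // Hypothetical stand-in for the silent adjustment: add the zone offset so that
    // the fields the user sees locally become the fields read in UTC.
    static long shiftByZoneOffset(long epochMillis, TimeZone userZone) {
        return epochMillis + userZone.getOffset(epochMillis);
    }

    // Render an epoch value as a UTC wall-clock string.
    static String formatUtc(long epochMillis) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        TimeZone userZone = TimeZone.getTimeZone("GMT-07:00"); // assumed user zone
        long epoch = 1506622620000L; // the instant 2017-09-28 18:17:00 UTC
        long shifted = shiftByZoneOffset(epoch, userZone);

        System.out.println(formatUtc(epoch));   // 2017-09-28 18:17:00
        System.out.println(formatUtc(shifted)); // 2017-09-28 11:17:00
    }
}
```

A user who reads the Date fields in the local zone sees 11:17 and gets a literal of 11:17 UTC, which matches. A user who constructed the object from the epoch value 1506622620000 expects 18:17 UTC and gets a literal that is seven hours off.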

It would be ideal to:

# Provide primitives that carry time zone information by default, thus disambiguating
the times.
# Properly document all time-zone-related assumptions in Expression literals.
# Document in the web documentation that the Calcite TIMESTAMP function assumes the
timestamp is in UTC.
# Provide a time-zone-aware date parsing function in the SQL language.
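
As an illustration of the last item, the following sketch shows what time-zone-aware parsing means in plain Java 7. It is not a proposal for the SQL function's name or signature; ZonedParseSketch and parseToEpochMillis are hypothetical names, and the pattern string is an assumption.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

public class ZonedParseSketch {

    // Hypothetical helper: interpret a wall-clock string in an explicitly named zone.
    // Note that TimeZone.getTimeZone falls back to GMT for IDs it does not recognize.
    static long parseToEpochMillis(String text, String zoneId) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
        fmt.setLenient(false);
        fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
        try {
            return fmt.parse(text).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable timestamp: " + text, e);
        }
    }

    public static void main(String[] args) {
        String text = "2017-09-28 18:17:00";
        System.out.println(parseToEpochMillis(text, "UTC"));       // 1506622620000
        System.out.println(parseToEpochMillis(text, "GMT-07:00")); // 1506647820000
    }
}
```

The same string maps to two different instants depending on the zone, which is why the zone has to be an explicit argument rather than an implicit default.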

Regarding the primitives, since we have to support Java 7, we can't use the Java 8 time API.
I'm guessing the decision would be between using Joda-Time and making thin data structures
that could easily be converted to various other time primitives.
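
A thin data structure of the kind mentioned above might look roughly like the following Java 7 compatible sketch. The class name ZonedTimestampSketch and its methods are hypothetical; the only idea taken from the text is that the value carries its zone explicitly and converts to other time primitives on demand.

```java
import java.sql.Timestamp;
import java.util.Calendar;
import java.util.TimeZone;

public final class ZonedTimestampSketch {
    private final long epochMillis; // unambiguous instant
    private final String zoneId;    // zone in which fields should be interpreted

    public ZonedTimestampSketch(long epochMillis, String zoneId) {
        if (zoneId == null) {
            throw new IllegalArgumentException("zoneId must not be null");
        }
        this.epochMillis = epochMillis;
        this.zoneId = zoneId;
    }

    public long getEpochMillis() {
        return epochMillis;
    }

    public String getZoneId() {
        return zoneId;
    }

    // Conversion to a legacy primitive; the zone is dropped, the instant is kept.
    public Timestamp toSqlTimestamp() {
        return new Timestamp(epochMillis);
    }

    // Conversion to a Calendar whose fields are expressed in the stored zone.
    public Calendar toCalendar() {
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone(zoneId));
        cal.setTimeInMillis(epochMillis);
        return cal;
    }

    public static void main(String[] args) {
        ZonedTimestampSketch t = new ZonedTimestampSketch(1506622620000L, "GMT-07:00");
        System.out.println(t.toCalendar().get(Calendar.HOUR_OF_DAY)); // 11
        System.out.println(t.toSqlTimestamp().getTime());             // 1506622620000
    }
}
```

Because the instant and the zone are stored separately, no silent adjustment of the epoch value is needed when the literal is converted.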
