impala-issues mailing list archives

From "Matthew Jacobs (JIRA)" <>
Subject [jira] [Created] (IMPALA-5609) Refactor TimestampFunctions and TimestampValue
Date Fri, 30 Jun 2017 18:13:00 GMT
Matthew Jacobs created IMPALA-5609:

             Summary: Refactor TimestampFunctions and TimestampValue
                 Key: IMPALA-5609
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend
    Affects Versions: Impala 2.9.0
            Reporter: Matthew Jacobs

The timestamp-related code in the backend is getting very confusing. We have several logical
groups of timestamp-related code:
* TIMESTAMP functions (i.e. the public SQL functions)
* The Timestamp types: TimestampValue and its UDF cousin TimestampVal
* Various conversion functions between different kinds of types/formats, e.g. TimestampVal,
TimestampValue, Unix times, ptimes, strings with formats.

Things got particularly hairy when the {{-use_local_tz_for_unix_timestamp_conversions}} flag
was added. The purpose of that flag was to enable Hive/MySQL compatibility for the {{unix_timestamp(...)}}
and {{from_unixtime(...)}} SQL functions. However, the logic that handles this flag (and thus
determines whether to convert between local time and UTC) lives in TimestampValue.
This was a mistake: a TimestampValue should only represent a ptime. This has led to a lot
of confusion and bugs when code using TimestampValue ended up getting unwanted timezone conversions.

We should clean up the code by:
1) ensuring that TimestampFunctions contains only the SQL functions that get exposed. The
handling of {{-use_local_tz_for_unix_timestamp_conversions}} should be in TimestampFunctions,
because it should only affect SQL functions and not the raw data types themselves.
2) moving all the 'Unix time' logic out of TimestampValue and into a TimestampUtil class. It
can expose Unix time functions for cases involving local time and UTC times, and TimestampFunctions
can call the appropriate functions. The timezones should not bleed into TimestampValue.
3) making TimestampValue basically just a wrapper around a ptime.
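
A rough sketch of the layering proposed above, to make the split concrete. This is not the
existing Impala code: {{PTime}} is a simplified stand-in for boost's ptime (seconds only),
{{Flags}} is a hypothetical stand-in for the process flags, the UTC offset is passed in
explicitly instead of being looked up from the system timezone, and all method names are
placeholders.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for boost::posix_time::ptime (assumption: seconds precision only).
// It is a plain wall-clock value with no timezone attached.
struct PTime {
  int64_t secs_since_epoch;
};

// (3) TimestampValue is just a wrapper around a ptime: no flags, no timezones.
class TimestampValue {
 public:
  explicit TimestampValue(PTime t) : time_(t) {}
  PTime ptime() const { return time_; }

 private:
  PTime time_;
};

// (2) Hypothetical TimestampUtil: owns all Unix-time conversions and exposes
// both the UTC and the local-time interpretation explicitly.
class TimestampUtil {
 public:
  // Interprets the wrapped value as UTC.
  static int64_t UtcToUnixTime(const TimestampValue& tv) {
    return tv.ptime().secs_since_epoch;
  }
  // Interprets the wrapped value as local time at the given UTC offset.
  static int64_t LocalToUnixTime(const TimestampValue& tv, int64_t utc_offset_secs) {
    return tv.ptime().secs_since_epoch - utc_offset_secs;
  }
  static TimestampValue UnixTimeToUtc(int64_t unix_time) {
    return TimestampValue(PTime{unix_time});
  }
  static TimestampValue UnixTimeToLocal(int64_t unix_time, int64_t utc_offset_secs) {
    return TimestampValue(PTime{unix_time + utc_offset_secs});
  }
};

// Hypothetical stand-in for the process-wide flags.
struct Flags {
  bool use_local_tz_for_unix_timestamp_conversions = false;
  int64_t local_utc_offset_secs = 0;  // assumption: fixed offset, no DST handling
};

// (1) TimestampFunctions: the SQL-facing layer and the only place that reads the flag.
class TimestampFunctions {
 public:
  // Sketch of unix_timestamp(ts).
  static int64_t UnixTimestamp(const TimestampValue& tv, const Flags& flags) {
    return flags.use_local_tz_for_unix_timestamp_conversions
               ? TimestampUtil::LocalToUnixTime(tv, flags.local_utc_offset_secs)
               : TimestampUtil::UtcToUnixTime(tv);
  }
  // Sketch of from_unixtime(n), returning the timestamp rather than a string.
  static TimestampValue FromUnixtime(int64_t unix_time, const Flags& flags) {
    return flags.use_local_tz_for_unix_timestamp_conversions
               ? TimestampUtil::UnixTimeToLocal(unix_time, flags.local_utc_offset_secs)
               : TimestampUtil::UnixTimeToUtc(unix_time);
  }
};
```

With this shape, code that only constructs or compares TimestampValues can never trigger a
timezone conversion by accident; only callers that go through TimestampFunctions see the
effect of the flag.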

This message was sent by Atlassian JIRA
