drill-dev mailing list archives

From vdiravka <...@git.apache.org>
Subject [GitHub] drill pull request #656: DRILL-5034: Select timestamp from hive generated pa...
Date Sun, 29 Jan 2017 17:55:50 GMT
Github user vdiravka commented on a diff in the pull request:

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetReaderUtility.java
    @@ -323,18 +323,28 @@ public static DateCorruptionStatus checkForCorruptDateValuesInStatistics(Parquet
        * @param binaryTimeStampValue
        *          hive, impala timestamp values with nanoseconds precision
        *          are stored in parquet Binary as INT96 (12 constant bytes)
    -   *
    +   * @param retainLocalTimezone
    +   *          parquet files don't keep local timeZone according to the
    +   *          <a href="https://github.com/Parquet/parquet-format/blob/master/LogicalTypes.md#timestamp">Parquet
    +   *          but some tools (hive, for example) retain local timezone for parquet files by default
    +   *          Note: Impala doesn't retain local timezone by default
        * @return  Unix Timestamp - the number of milliseconds since January 1, 1970, 00:00:00
        *          represented by @param binaryTimeStampValue .
    -    public static long getDateTimeValueFromBinary(Binary binaryTimeStampValue) {
    +    public static long getDateTimeValueFromBinary(Binary binaryTimeStampValue, boolean retainLocalTimezone) {
           // This method represents binaryTimeStampValue as ByteBuffer, where timestamp is stored as sum of
           // julian day number (32-bit) and nanos of day (64-bit)
           NanoTime nt = NanoTime.fromBinary(binaryTimeStampValue);
           int julianDay = nt.getJulianDay();
           long nanosOfDay = nt.getTimeOfDayNanos();
    -      return (julianDay - JULIAN_DAY_NUMBER_FOR_UNIX_EPOCH) * DateTimeConstants.MILLIS_PER_DAY
    +      long dateTime = (julianDay - JULIAN_DAY_NUMBER_FOR_UNIX_EPOCH) * DateTimeConstants.MILLIS_PER_DAY
               + nanosOfDay / NANOS_PER_MILLISECOND;
    +      if (retainLocalTimezone) {
    +        return new org.joda.time.DateTime(dateTime, org.joda.time.chrono.JulianChronology.getInstance())
    +            .withZoneRetainFields(org.joda.time.DateTimeZone.UTC).getMillis();
    --- End diff --
    The `withZoneRetainFields` method calculates the difference between the local timezone and UTC (the zone passed to that method) and returns the original dateTime shifted by that difference. This approach is used frequently in Drill code.
    On further thought, though, a simpler statement that avoids creating a DateTime object is possible:
    `DateTimeZone.getDefault().convertUTCToLocal(dateTime)`. I think it is clearer.
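
    To illustrate the arithmetic discussed above, here is a minimal, dependency-free sketch. It is not the Drill code: the class and method names are hypothetical, it uses `java.util.TimeZone` in place of Joda-Time's `DateTimeZone`, and it assumes that the epoch constant is Julian day 2440588 and that `convertUTCToLocal` amounts to adding the zone's offset at the given instant.

```java
import java.util.TimeZone;

// Hypothetical illustration only; names are not taken from the Drill code base.
class Int96TimestampSketch {
    // Assumed Julian day number of 1970-01-01 (the Unix epoch).
    static final long JULIAN_DAY_OF_UNIX_EPOCH = 2440588L;
    static final long MILLIS_PER_DAY = 86_400_000L;
    static final long NANOS_PER_MILLI = 1_000_000L;

    // Combines the two INT96 parts (Julian day, nanos of day) into epoch millis.
    static long toEpochMillis(int julianDay, long nanosOfDay) {
        return (julianDay - JULIAN_DAY_OF_UNIX_EPOCH) * MILLIS_PER_DAY
            + nanosOfDay / NANOS_PER_MILLI;
    }

    // Shifts an instant by the zone's offset at that instant, which is what
    // the proposed convertUTCToLocal call is assumed to do.
    static long shiftByZoneOffset(long epochMillis, TimeZone zone) {
        return epochMillis + zone.getOffset(epochMillis);
    }

    public static void main(String[] args) {
        // One day and one hour after the epoch.
        long millis = toEpochMillis(2440589, 3_600_000_000_000L);
        System.out.println(millis);
        // The zone is passed explicitly so the result does not depend on the JVM default.
        System.out.println(shiftByZoneOffset(millis, TimeZone.getTimeZone("GMT+02:00")));
    }
}
```

    The sketch takes the zone as a parameter rather than reading the JVM default, which keeps the result reproducible; the statement proposed above uses `DateTimeZone.getDefault()` instead.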

