hive-issues mailing list archives

From "Barna Zsombor Klara (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-12767) Implement table property to address Parquet int96 timestamp bug
Date Tue, 07 Feb 2017 14:46:41 GMT

     [ https://issues.apache.org/jira/browse/HIVE-12767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Barna Zsombor Klara updated HIVE-12767:
---------------------------------------
    Attachment: HIVE-12767.6.patch

Fixed a regression I introduced while refactoring NanoTimeUtils.
To prevent the adjustment, we need to use the GMT time zone for Parquet files written by Impala.
This should take care of the Parquet qtest failures.
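
As a rough illustration of why GMT prevents the adjustment: a zone-based shift adds the zone's offset at the given instant, and GMT's offset is always zero. The class and method names below are hypothetical and are not taken from the patch; the direction of the shift is also only illustrative.

```java
import java.util.TimeZone;

// Illustrative sketch only; not the NanoTimeUtils code from the patch.
class GmtNoAdjustmentSketch {

    // Hypothetical helper: shifts an epoch-millis value by the zone's
    // offset (in milliseconds) at that instant.
    static long adjust(long epochMillis, TimeZone zone) {
        return epochMillis + zone.getOffset(epochMillis);
    }

    public static void main(String[] args) {
        long instant = 1486478801000L; // 2017-02-07T14:46:41Z

        // GMT has a zero offset, so the value comes back unchanged.
        System.out.println(adjust(instant, TimeZone.getTimeZone("GMT")) == instant);

        // Any other zone moves the value by its offset at that instant.
        TimeZone la = TimeZone.getTimeZone("America/Los_Angeles");
        System.out.println(adjust(instant, la) - instant);
    }
}
```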

> Implement table property to address Parquet int96 timestamp bug
> ---------------------------------------------------------------
>
>                 Key: HIVE-12767
>                 URL: https://issues.apache.org/jira/browse/HIVE-12767
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.2.1, 2.0.0
>            Reporter: Sergio Peña
>            Assignee: Barna Zsombor Klara
>         Attachments: HIVE-12767.3.patch, HIVE-12767.4.patch, HIVE-12767.5.patch, HIVE-12767.6.patch,
TestNanoTimeUtils.java
>
>
> Parquet timestamps stored as INT96 are not compatible with other tools, such as Impala,
because Hive adjusts time zone values differently than Impala does.
> To address this, a new table property (parquet.mr.int96.write.zone) must be used
in Hive that specifies which time zone to use when writing and reading timestamps from Parquet.
> The exit criteria for the fix are:
> * Hive will read Parquet MR int96 timestamp data and adjust values using a time zone
from a table property, if set, or using the local time zone if it is absent. No adjustment
will be applied to data written by Impala.
> * Hive will write Parquet int96 timestamps using a time zone adjustment from the same
table property, if set, or using the local time zone if it is absent. This keeps the data
in the table consistent.
> * New tables created by Hive will set the table property to UTC if the global option
to set the property for new tables is enabled.
> ** Tables created using CREATE TABLE and CREATE TABLE LIKE FILE will not set the property
unless the global setting to do so is enabled.
> ** Tables created using CREATE TABLE LIKE <OTHER TABLE> will copy the property
of the table that is copied.
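
The read and write criteria above amount to a small zone-selection rule, which could be sketched roughly as follows. Only the property name comes from the issue description; the class, the method, and the writtenByImpala flag are hypothetical, and how a file is recognized as written by Impala is not covered here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TimeZone;

// Illustrative sketch only; not the implementation from the attached patches.
class Int96ZoneSketch {

    // Table property named in the issue description.
    static final String WRITE_ZONE_PROPERTY = "parquet.mr.int96.write.zone";

    // Hypothetical helper: picks the zone used to adjust int96 timestamps.
    static TimeZone resolveZone(Map<String, String> tableProps,
                                boolean writtenByImpala,
                                TimeZone localZone) {
        if (writtenByImpala) {
            // No adjustment for Impala data: GMT has a zero offset.
            return TimeZone.getTimeZone("GMT");
        }
        String zoneId = tableProps.get(WRITE_ZONE_PROPERTY);
        if (zoneId == null || zoneId.isEmpty()) {
            // Property absent: fall back to the local time zone.
            return localZone;
        }
        // Note: TimeZone.getTimeZone silently returns GMT for an unknown ID,
        // so real code would likely validate the value first.
        return TimeZone.getTimeZone(zoneId);
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        TimeZone local = TimeZone.getTimeZone("America/Los_Angeles");

        System.out.println(resolveZone(props, false, local).getID()); // local fallback
        props.put(WRITE_ZONE_PROPERTY, "UTC");
        System.out.println(resolveZone(props, false, local).getID()); // from the property
        System.out.println(resolveZone(props, true, local).getID());  // Impala data
    }
}
```

Using the same rule for both reads and writes is what keeps the data in a table consistent, as the second criterion requires.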



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
