Subject: Re: Review Request 56334: HIVE-12767: Implement table property to address Parquet int96 timestamp bug
From: Sergio Pena
To: Sergio Pena, Ryan Blue
Cc: Barna Zsombor Klara, Peter Vary, hive
Date: Thu, 09 Feb 2017 17:45:34 -0000
Message-ID: <20170209174534.31773.31949@reviews.apache.org>
References: <20170208214846.31774.20436@reviews.apache.org>
In-Reply-To: <20170208214846.31774.20436@reviews.apache.org>
X-ReviewRequest-URL: https://reviews.apache.org/r/56334/
X-ReviewRequest-Repository: hive-git
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="===============7528285764965928282=="

--===============7528285764965928282==
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit


> On Feb. 8, 2017, 9:48 p.m., Sergio Pena wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java, line 183
> >
> > Should we log a warning and keep the timeZoneID at the default instead? I don't know if it is good to avoid reading a table that has an invalid timezone property.
>
> Barna Zsombor Klara wrote:
>     I think an invalid timezone means that the user made a typo when the property was being set. Since this can happen transparently, without anyone noticing, I think we should prevent both the reading and the writing of the data. But I can be convinced otherwise.

Thanks. I agree with you.


- Sergio


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56334/#review164773
-----------------------------------------------------------


On Feb. 9, 2017, 5:23 p.m., Barna Zsombor Klara wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56334/
> -----------------------------------------------------------
>
> (Updated Feb. 9, 2017, 5:23 p.m.)
>
>
> Review request for hive, Ryan Blue and Sergio Pena.
>
>
> Bugs: HIVE-12767
>     https://issues.apache.org/jira/browse/HIVE-12767
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> This is a follow-up on this review request: https://reviews.apache.org/r/41821
> The following exit criteria are addressed in this patch:
>
> - Hive will read Parquet MR int96 timestamp data and adjust values using a time zone from a table property, if set, or using the local time zone if it is absent. No adjustment will be applied to data written by Impala.
> - Hive will write Parquet int96 timestamps using a time zone adjustment from the same table property, if set, or using the local time zone if it is absent. This keeps the data in the table consistent.
> - New tables created by Hive will set the table property to UTC if the global option to set the property for new tables is enabled.
> - Tables created using CREATE TABLE and CREATE TABLE LIKE FILE will not set the property unless the global setting to do so is enabled.
> - Tables created using CREATE TABLE LIKE will copy the property of the table that is copied.
>
> To set the timezone table property, use this:
> create table tbl1 (ts timestamp) stored as parquet tblproperties ('parquet.mr.int96.write.zone'='PST');
>
> To set UTC as default timezone table property on new tables created, use this:
> set parquet.mr.int96.enable.utc.write.zone=true;
> create table tbl2 (ts timestamp) stored as parquet;
>
>
> Diffs
> -----
>
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java cb27cd6236dc1d54efd2cafda9f1cc975e9d9817
>   data/files/impala_int96_timestamp.parq PRE-CREATION
>   itests/hive-jmh/src/main/java/org/apache/hive/benchmark/storage/ColumnarStorageBench.java a14b7900afb00a7d304b0dc4f6482a2b87716919
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java adabe70fa8f0fe1b990c6ac578a14ff5af06fc93
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java 379a9135d9c631b2f473976b00f3dc87f9fec0c4
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java 167f9b6516ac093fa30091daf6965de25e3eccb3
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 76d93b8e02a98c95da8a534f2820cd3e77b4bb43
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java 604cbbcc2a9daa8594397e315cc4fd8064cc5005
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java ac430a67682d3dcbddee89ce132fc0c1b421e368
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetTableUtils.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTimeUtils.java 3fd75d24f3fda36967e4957e650aec19050b22f8
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedParquetRecordReader.java 699de59b58ecbfa0705f042d22e697f1573ea496
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedPrimitiveColumnReader.java 3d5c6e6a092dd6a0303fadc6a244dad2e31cd853
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriteSupport.java f4621e5dbb81e8d58c4572c901ec9d1a7ca8c012
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java 6b7b50a25e553629f0f492e964cc4913417cb500
>   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestDataWritableWriter.java 934ae9f255d0c4ccaa422054fcc9e725873810d4
>   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/VectorizedColumnReaderTestBase.java 833cfdbaa2ac6a3653a8f3232344f5fb70c80686
>   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/convert/TestETypeConverter.java PRE-CREATION
>   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/serde/TestParquetTimestampUtils.java ec6def5b9ac5f12e6a7cb24c4f4998a6ca6b4a8e
>   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/timestamp/TestNanoTimeUtils.java PRE-CREATION
>   ql/src/test/queries/clientpositive/parquet_int96_timestamp.q PRE-CREATION
>   ql/src/test/results/clientpositive/parquet_int96_timestamp.q.out PRE-CREATION
>
> Diff: https://reviews.apache.org/r/56334/diff/
>
>
> Testing
> -------
>
> qtest and unit tests added.
>
>
> Thanks,
>
> Barna Zsombor Klara
>
>

--===============7528285764965928282==--
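
The typo concern raised in the review is concrete in Java: java.util.TimeZone.getTimeZone(String) does not fail on an unrecognized ID, it quietly returns GMT. The following is a minimal sketch of the strict behavior agreed on in the thread, under stated assumptions. The class name Int96ZoneSketch and the method resolveZone are hypothetical and are not taken from the patch; only the property name 'parquet.mr.int96.write.zone' comes from the description. Checking against TimeZone.getAvailableIDs() is one possible validation, and it also rejects custom IDs such as GMT+08:00, which may or may not be desirable.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.TimeZone;

// Hypothetical illustration; none of these class or method names come from the Hive patch.
public class Int96ZoneSketch {
    // Property name as given in the review request description.
    public static final String WRITE_ZONE_PROPERTY = "parquet.mr.int96.write.zone";

    // Returns the zone named by the table property, or the JVM default zone
    // when the property is absent. Throws on an unrecognized ID instead of
    // silently falling back, so a typo cannot go unnoticed.
    public static TimeZone resolveZone(Map<String, String> tableProperties) {
        String zoneId = tableProperties.get(WRITE_ZONE_PROPERTY);
        if (zoneId == null) {
            return TimeZone.getDefault();
        }
        // Assumption: only IDs listed by the JVM are accepted.
        if (!Arrays.asList(TimeZone.getAvailableIDs()).contains(zoneId)) {
            throw new IllegalArgumentException(
                "Invalid value for " + WRITE_ZONE_PROPERTY + ": " + zoneId);
        }
        return TimeZone.getTimeZone(zoneId);
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();

        // Property absent: the local (JVM default) zone is used.
        System.out.println("absent  -> " + resolveZone(props).getID());

        // Property set to a known ID.
        props.put(WRITE_ZONE_PROPERTY, "UTC");
        System.out.println("UTC     -> " + resolveZone(props).getID());

        // Property set to a misspelled ID: the strict lookup refuses it.
        props.put(WRITE_ZONE_PROPERTY, "PSTT");
        try {
            resolveZone(props);
        } catch (IllegalArgumentException e) {
            System.out.println("PSTT    -> rejected: " + e.getMessage());
        }

        // For comparison, the lenient JDK lookup falls back to GMT.
        System.out.println("lenient -> " + TimeZone.getTimeZone("PSTT").getID());
    }
}
```

Failing fast here matches the position taken in the review: a warning plus a default would let reads and writes proceed with a zone the user never intended.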