hive-dev mailing list archives

From "Brock Noland" <br...@cloudera.com>
Subject Re: Review Request 30337: HIVE-9482 : Hive parquet timestamp compatibility
Date Wed, 28 Jan 2015 22:15:09 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30337/#review70096
-----------------------------------------------------------

Ship it!

- Brock Noland


On Jan. 28, 2015, 8:10 p.m., Szehon Ho wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30337/
> -----------------------------------------------------------
> 
> (Updated Jan. 28, 2015, 8:10 p.m.)
> 
> 
> Review request for hive and Brock Noland.
> 
> 
> Bugs: HIVE-9482
>     https://issues.apache.org/jira/browse/HIVE-9482
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> In the current Hive implementation, timestamps are stored in UTC (converted from the current time zone), based on the original Parquet timestamp spec.
> However, we found that this is not compatible with other tools. After some investigation, it is also not how the other file formats behave, or even some databases (the Hive Timestamp is closer to the 'timestamp without timezone' datatype).
> 
> This is the first part of the fix, which will restore compatibility with parquet-timestamp files generated by external tools by skipping conversion on reading.
> 
> A later fix will change the write path to not convert, and will stop the read conversion even for files written by Hive itself.
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 64e7e0a
>   data/files/parquet_external_time.parq PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ConverterParent.java a86d6f4
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/DataWritableRecordConverter.java 000e8ea
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 23bb364
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveCollectionConverter.java 872900b
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveGroupConverter.java 11772be
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveStructConverter.java eeb3838
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/Repeated.java af28b4c
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java 3f8e4d7
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java 4e4d7fd
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTimeUtils.java c647b24
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java 41b5f1c
>   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/serde/TestParquetTimestampUtils.java 2e788bd
>   ql/src/test/queries/clientpositive/parquet_external_time.q PRE-CREATION
>   ql/src/test/results/clientpositive/parquet_external_time.q.out PRE-CREATION
> 
> Diff: https://reviews.apache.org/r/30337/diff/
> 
> 
> Testing
> -------
> 
> Added new unit tests (TestParquetTimestampUtils) to test the non-conversion code path.
> 
> Also added a new q-test that reads a parquet timestamp file generated by an external tool, in this case Impala.
> 
> 
> Thanks,
> 
> Szehon Ho
> 
>
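
To illustrate the read-path behavior described in the quoted request, here is a minimal Java sketch. It is not the code from the patch: the class TimestampReadSketch, its methods, and the skipConversion flag are hypothetical names chosen for illustration, and it uses java.time rather than Hive's own timestamp utilities. It only shows the difference between treating a stored value as UTC and shifting it to the reader's time zone, versus returning the stored wall-clock value unchanged.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class TimestampReadSketch {

    // Converting read (assumed model of the existing behavior): the stored
    // value is interpreted as UTC and shifted to the reader's time zone.
    public static LocalDateTime readWithConversion(LocalDateTime stored, ZoneId readerZone) {
        return stored.atZone(ZoneOffset.UTC)
                     .withZoneSameInstant(readerZone)
                     .toLocalDateTime();
    }

    // Non-converting read: the stored wall-clock value is returned as-is,
    // like a 'timestamp without timezone'.
    public static LocalDateTime readWithoutConversion(LocalDateTime stored) {
        return stored;
    }

    // skipConversion is a hypothetical flag standing in for whatever decides
    // that a file came from an external tool.
    public static LocalDateTime read(LocalDateTime stored, ZoneId readerZone, boolean skipConversion) {
        return skipConversion
                ? readWithoutConversion(stored)
                : readWithConversion(stored, readerZone);
    }

    public static void main(String[] args) {
        LocalDateTime stored = LocalDateTime.of(2015, 1, 28, 12, 0, 0);
        ZoneId zone = ZoneId.of("America/Los_Angeles");

        System.out.println(read(stored, zone, false)); // 2015-01-28T04:00
        System.out.println(read(stored, zone, true));  // 2015-01-28T12:00
    }
}
```

With a reader in America/Los_Angeles, the converting path shifts the value by eight hours, while the non-converting path returns what the external tool wrote.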

