drill-issues mailing list archives

From "Rahul Challapalli (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-4373) Drill and Hive have incompatible timestamp representations in parquet
Date Wed, 22 Mar 2017 22:44:41 GMT

    [ https://issues.apache.org/jira/browse/DRILL-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15937305#comment-15937305 ]

Rahul Challapalli commented on DRILL-4373:
------------------------------------------

[~knguyen] I am wondering how this timestamp column would behave in the following scenarios:

1. Generating a metadata cache on top of a file that contains Hive-generated timestamps. What
would happen, and what would the contents of the cache file look like?
2. Running the Drill native parquet reader on top of Hive tables that have timestamp columns.
3. Running Drill timestamp functions after converting Hive timestamps with the
IMPALA_TIMESTAMP_LOCALTIMEZONE function.
4. Running the IMPALA_TIMESTAMP_LOCALTIMEZONE function in a view.

> Drill and Hive have incompatible timestamp representations in parquet
> ---------------------------------------------------------------------
>
>                 Key: DRILL-4373
>                 URL: https://issues.apache.org/jira/browse/DRILL-4373
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Hive, Storage - Parquet
>    Affects Versions: 1.8.0
>            Reporter: Rahul Challapalli
>            Assignee: Vitalii Diravka
>              Labels: doc-impacting
>             Fix For: 1.10.0
>
>
> git.commit.id.abbrev=83d460c
> I created a parquet file with a timestamp type using Drill. Now if I define a Hive table
on top of the parquet file and use "timestamp" as the column type, Drill fails to read the
Hive table through the Hive storage plugin.
> Implementation: 
> Added an int96-to-timestamp converter for both parquet readers, controlled by the system
/ session option "store.parquet.int96_as_timestamp".
> The option is false by default so that old query scripts that use the
"convert_from TIMESTAMP_IMPALA" function keep working.
> When the option is true, using that function is unnecessary and can cause the query to
fail.
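
For readers unfamiliar with the int96 representation mentioned above, here is a minimal
Python sketch of the decoding, not Drill's actual (Java) converter. It assumes the layout
conventionally written by Impala and Hive: 8 bytes of nanoseconds within the day followed
by 4 bytes of Julian day number, both little-endian. The function name int96_to_timestamp
is hypothetical, and treating the decoded value as UTC is also an assumption, since time
zone handling is part of what this issue is about.

```python
import struct
from datetime import datetime, timedelta, timezone

# Julian day number that corresponds to the Unix epoch, 1970-01-01.
JULIAN_DAY_OF_UNIX_EPOCH = 2440588


def int96_to_timestamp(raw: bytes) -> datetime:
    """Decode a 12-byte parquet INT96 timestamp into a datetime.

    Assumed layout: bytes 0-7 are nanoseconds within the day and
    bytes 8-11 are the Julian day number, both little-endian.
    The result is labeled UTC, which is an assumption of this sketch.
    """
    if len(raw) != 12:
        raise ValueError("an INT96 timestamp must be exactly 12 bytes")
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    days_since_epoch = julian_day - JULIAN_DAY_OF_UNIX_EPOCH
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # datetime only keeps microseconds, so sub-microsecond digits are dropped.
    return epoch + timedelta(days=days_since_epoch,
                             microseconds=nanos_of_day // 1000)


if __name__ == "__main__":
    # Zero nanoseconds on the epoch's Julian day decodes to the Unix epoch.
    raw_epoch = struct.pack("<qi", 0, JULIAN_DAY_OF_UNIX_EPOCH)
    print(int96_to_timestamp(raw_epoch))
```

A reader that does not apply a conversion like this sees only 12 opaque bytes, which is
why either the "store.parquet.int96_as_timestamp" option or an explicit convert_from call
is needed, but not both at once.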



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
