drill-issues mailing list archives

From "Vitalii Diravka (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5097) Using store.parquet.reader.int96_as_timestamp gives IOOB whereas convert_from works
Date Wed, 14 Dec 2016 14:17:59 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15748427#comment-15748427 ]

Vitalii Diravka commented on DRILL-5097:
----------------------------------------

Agreed. I missed that when the INT96 value is converted into a timestamp (originally a long),
we cut [the nanosecond precision to milliseconds|https://github.com/apache/drill/blob/04fb0be191ef09409c00ca7173cb903dfbe2abb0/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetReaderUtility.java#L337].
The fix for this issue is to change dataTypeLengthInBits of the columnReader when converting
the Parquet fixed binary type INT96 into a Drill TimeStamp.
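
To illustrate the width change described above, here is a minimal sketch of a 12-byte INT96 value being reduced to an 8-byte epoch-millisecond long. It is not Drill's actual code: the class and method names are hypothetical, and it assumes the layout commonly written by Impala and Spark (nanoseconds of day in the first 8 bytes, Julian day number in the last 4 bytes, both little-endian).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration only; not part of Drill.
final class Int96TimestampSketch {
    // Julian day number of 1970-01-01 (the Unix epoch).
    static final long JULIAN_DAY_OF_EPOCH = 2_440_588L;
    static final long MILLIS_PER_DAY = 86_400_000L;
    static final long NANOS_PER_MILLI = 1_000_000L;

    // Converts a 12-byte INT96 value into epoch milliseconds (an 8-byte long).
    // Assumed layout: bytes 0..7 = nanoseconds of day, bytes 8..11 = Julian day,
    // both little-endian. Sub-millisecond nanoseconds are truncated.
    static long int96ToEpochMillis(byte[] int96) {
        if (int96.length != 12) {
            throw new IllegalArgumentException("INT96 value must be 12 bytes, got " + int96.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(int96).order(ByteOrder.LITTLE_ENDIAN);
        long nanosOfDay = buf.getLong();
        int julianDay = buf.getInt();
        return (julianDay - JULIAN_DAY_OF_EPOCH) * MILLIS_PER_DAY
                + nanosOfDay / NANOS_PER_MILLI;
    }

    // Builds a 12-byte INT96 value in the same assumed layout (used for testing the sketch).
    static byte[] encode(int julianDay, long nanosOfDay) {
        return ByteBuffer.allocate(12)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(nanosOfDay)
                .putInt(julianDay)
                .array();
    }
}
```

The point of the sketch is that each source value occupies 12 bytes while the converted value occupies 8, so a reader that sizes its buffer for one width but copies the other would likely run past the buffer's capacity.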

> Using store.parquet.reader.int96_as_timestamp gives IOOB whereas convert_from works
> -----------------------------------------------------------------------------------
>
>                 Key: DRILL-5097
>                 URL: https://issues.apache.org/jira/browse/DRILL-5097
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Codegen
>    Affects Versions: 1.9.0
>            Reporter: Vitalii Diravka
>            Assignee: Vitalii Diravka
>              Labels: FreeMarker, TimestampVector
>             Fix For: Future
>
>         Attachments: data.snappy.parquet
>
>
> Need to change the default width of the timestamp vector from 8 bytes to 12 bytes, since timestamp values can hold a new INT96 primitive type.
> Use case for it:
> Using store.parquet.reader.int96_as_timestamp gives IOOB whereas convert_from works.

> The below query succeeds:
> {code}
> select c, convert_from(d, 'TIMESTAMP_IMPALA') from dfs.`/drill/testdata/parquet_timestamp/spark_generated/d3`;
> {code}
> The below query fails:
> {code}
> 0: jdbc:drill:zk=10.10.100.190:5181> alter session set `store.parquet.reader.int96_as_timestamp` = true;
> +-------+---------------------------------------------------+
> |  ok   |                      summary                      |
> +-------+---------------------------------------------------+
> | true  | store.parquet.reader.int96_as_timestamp updated.  |
> +-------+---------------------------------------------------+
> 1 row selected (0.231 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> select c, d from dfs.`/drill/testdata/parquet_timestamp/spark_generated/d3`;
> Error: SYSTEM ERROR: IndexOutOfBoundsException: readerIndex: 0, writerIndex: 131076 (expected: 0 <= readerIndex <= writerIndex <= capacity(131072))
> Fragment 0:0
> [Error Id: bd94f477-7c01-420f-8920-06263212177b on qa-node190.qa.lab:31010] (state=,code=0)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
