drill-issues mailing list archives

From "Paul Rogers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-4764) Parquet file with INT_16, etc. logical types not supported by simple SELECT
Date Tue, 05 Jul 2016 16:00:12 GMT

    [ https://issues.apache.org/jira/browse/DRILL-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15362679#comment-15362679
] 

Paul Rogers commented on DRILL-4764:
------------------------------------

Note that these failures appear to be by design: I've learned that Drill does not support
Parquet logical types. That is, files that use logical types are not expected to work.

Still, if writers of Parquet files use logical types, Drill, as a member in
good standing of the "Hadoop and Friends" ecosystem, should be able to read those files, especially
in the trivial cases where the logical types add no information. That is, if Drill always
treats int32 storage types as 32-bit signed ints, then the types int_8, uint_8, int_16, uint_16
and int_32 add no information and are benign.
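
The "benign" claim can be checked with a small range comparison. The sketch below is illustrative only and is not Drill code: the table name and helper function are hypothetical, and the ranges are simply the usual two's-complement bounds for each bit width. It also suggests why uint_32 is not in the list above: its upper range does not fit in a signed 32-bit int.

```python
# Bounds of a signed 32-bit integer, the type Drill is assumed to use for int32 storage.
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1

# Value ranges implied by each integer annotation on int32 storage.
# (Hypothetical table for illustration; not taken from Drill or Parquet source.)
LOGICAL_RANGES = {
    "INT_8": (-(2**7), 2**7 - 1),
    "UINT_8": (0, 2**8 - 1),
    "INT_16": (-(2**15), 2**15 - 1),
    "UINT_16": (0, 2**16 - 1),
    "INT_32": (INT32_MIN, INT32_MAX),
    "UINT_32": (0, 2**32 - 1),
}


def is_benign_on_int32(logical_type: str) -> bool:
    """Return True if every value of the logical type fits in a signed 32-bit int."""
    lo, hi = LOGICAL_RANGES[logical_type]
    return INT32_MIN <= lo and hi <= INT32_MAX


# Annotations that could be ignored without changing any value that is read.
benign = sorted(name for name in LOGICAL_RANGES if is_benign_on_int32(name))
```

Under these assumptions, `benign` contains the five types named above, while `UINT_32` is excluded because values above 2^31 - 1 would come back negative if read as signed.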

> Parquet file with INT_16, etc. logical types not supported by simple SELECT
> ---------------------------------------------------------------------------
>
>                 Key: DRILL-4764
>                 URL: https://issues.apache.org/jira/browse/DRILL-4764
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Data Types
>    Affects Versions: 1.6.0
>            Reporter: Paul Rogers
>         Attachments: int_16.parquet, int_8.parquet, uint_16.parquet, uint_32.parquet,
>                      uint_8.parquet
>
>
> Create a Parquet file with the following schema:
> message int16Data { required int32 index; required int32 value (INT_16); }
> Store it as int_16.parquet in the local file system. Query it with:
> SELECT * from `local`.`root`.`int_16.parquet`;
> The result, in the web UI, is this error:
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: UnsupportedOperationException:
> unsupported type: INT32 INT_16 Fragment 0:0 [Error Id: c63f66b4-e5a9-4a35-9ceb-546b74645dd4
> on 172.30.1.28:31010]
> The INT_16 logical (or "original") type simply tells consumers of the file that the data
> is actually a 16-bit signed int. Presumably, this should tell Drill to use the SmallIntVector
> (or NullableSmallIntVector) class for storage. Without support for this annotation, even 16-bit
> integers must be stored as 32 bits within Drill.
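
The vector selection suggested in the description could look roughly like the sketch below. This is a hypothetical lookup written in Python for brevity, not Drill's reader code: only the INT_16 to SmallIntVector (or NullableSmallIntVector) pairing comes from the issue text. The fallback to IntVector for unannotated int32 columns, the function name, and the handling of "required" versus "optional" are assumptions made for illustration.

```python
from typing import Optional

# Only the INT_16 entry is taken from the issue description; a real mapping
# would need entries for the other annotations as well.
VECTOR_BY_ANNOTATION = {
    "INT_16": "SmallIntVector",
}


def choose_vector(physical_type: str, annotation: Optional[str], repetition: str) -> str:
    """Pick a value vector class name for an int32 Parquet column (hypothetical helper)."""
    if physical_type != "INT32":
        raise ValueError("this sketch only covers int32 storage: " + physical_type)
    if repetition not in ("required", "optional"):
        raise ValueError("this sketch only covers required/optional columns: " + repetition)
    # Assumption: an unannotated or unknown annotation falls back to plain 32-bit storage.
    base = VECTOR_BY_ANNOTATION.get(annotation, "IntVector")
    # Optional columns can hold nulls, so they use the Nullable variant.
    return base if repetition == "required" else "Nullable" + base


# The column from the schema above: required int32 value (INT_16)
vector_name = choose_vector("INT32", "INT_16", "required")
```

With this mapping, the `value` column in the example schema would resolve to `SmallIntVector`, and the unannotated `index` column would stay on 32-bit storage.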



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
