drill-issues mailing list archives

From "Paul Rogers (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DRILL-4762) Parquet file with INT_32 column fails in simple SELECT
Date Mon, 04 Jul 2016 18:04:11 GMT

     [ https://issues.apache.org/jira/browse/DRILL-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers updated DRILL-4762:
    Attachment: date.parquet

Parquet file created with the schema and values described in the bug. The first and last values
are of primary interest; the others simply probe interesting cases.

> Parquet file with INT_32 column fails in simple SELECT
> ------------------------------------------------------
>                 Key: DRILL-4762
>                 URL: https://issues.apache.org/jira/browse/DRILL-4762
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Data Types
>    Affects Versions: 1.7.0
>            Reporter: Paul Rogers
>         Attachments: date.parquet, date.parquet, int32.parquet, int_32.parquet
> Create a Parquet file with the following schema:
> message int32Data { required int32 index; required int32 value (INT_32); }
> See attached file int_32.parquet.
> Query it as a local file using the web UI as follows:
> SELECT * from `local`.`root`.`int_32.parquet`;
> The following error is reported:
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: UnsupportedOperationException:
> unsupported type: INT32 INT_32 Fragment 0:0 [Error Id: 79fdbc5d-2c69-47bd-a8a5-28939546e13d
> This message suggests that the Parquet logical (or "original") type of signed INT_32
> is not supported. Logical types are important because the storage type (int32) only says
> how to store the data, while the logical type says how to interpret that data. In this case,
> the logical type is identical to the storage type: a 32-bit signed integer.
> Strangely, the exact same file, without the logical type, works:
> message int32Data { required int32 index; required int32 value; }
> Creates file int32.parquet (attached). Queried with:
> SELECT * from `local`.`root`.`int32.parquet`;
> Produces the expected 5 rows of output. (Values are 0, -1, 1, min int and max int).
> Expected behavior: Drill should support all Parquet logical types (or at least those
> layered on top of the scalar types).
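
The storage-versus-logical-type distinction in the report can be illustrated with a small sketch. It is not Drill's or Parquet's reader code: `encode_int32` and `decode_int32` are hypothetical helpers, and the only assumption taken from the Parquet format is that a plain-encoded int32 is stored as 4 little-endian bytes. The sketch uses the five probe values from the report and shows that reading them with the INT_32 annotation gives the same result as reading them with no annotation, whereas a different annotation (UINT_32) changes the interpretation.

```python
import struct

INT32_MIN = -2**31
INT32_MAX = 2**31 - 1

# The five probe values listed in the report: 0, -1, 1, min int, max int.
VALUES = [0, -1, 1, INT32_MIN, INT32_MAX]


def encode_int32(value):
    """Store a value as 4 little-endian bytes (the int32 storage type)."""
    return struct.pack("<i", value)


def decode_int32(raw, logical_type=None):
    """Interpret 4 stored bytes according to an optional logical type.

    Hypothetical helper: only None, "INT_32" and "UINT_32" are handled.
    """
    if logical_type in (None, "INT_32"):
        # INT_32 on int32 is the identity interpretation: signed 32-bit.
        return struct.unpack("<i", raw)[0]
    if logical_type == "UINT_32":
        # Same bytes, read as unsigned 32-bit.
        return struct.unpack("<I", raw)[0]
    raise ValueError(f"unsupported type: INT32 {logical_type}")


stored = [encode_int32(v) for v in VALUES]

plain = [decode_int32(b) for b in stored]                # no annotation
annotated = [decode_int32(b, "INT_32") for b in stored]  # signed annotation
unsigned = [decode_int32(b, "UINT_32") for b in stored]  # unsigned annotation

print(plain == annotated)  # the INT_32 annotation changes nothing
print(unsigned)            # -1 and min int are read differently
```

Under these assumptions, a reader that already handles unannotated int32 columns needs no extra conversion for INT_32, which is why the failure on int_32.parquet is surprising.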

This message was sent by Atlassian JIRA
