drill-issues mailing list archives

From "Parth Chandra (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-6223) Drill fails on Schema changes
Date Fri, 16 Mar 2018 17:25:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16402223#comment-16402223 ]

Parth Chandra commented on DRILL-6223:
--------------------------------------

Schema changes across Parquet files are not supported by the Parquet metadata cache. When the
schema changes, the cache overwrites it rather than merging, so the last schema encountered is
the one selected. Newly added columns are OK, I think, but type changes are not.

See [1].
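
To make the overwrite-versus-merge point concrete, here is a toy sketch in Python. It is not Drill's code and does not reproduce Metadata.java; it assumes a simplified model in which each file's schema is a map from column name to type name and the cache keeps the last type seen per column. The names last_writer_wins and find_type_conflicts, and the sample columns, are hypothetical.

```python
from typing import Dict, List

# Simplified, hypothetical model: column name -> type name.
Schema = Dict[str, str]


def last_writer_wins(file_schemas: List[Schema]) -> Schema:
    """Fold per-file schemas into one, letting later files overwrite earlier ones.

    New columns are simply added, but a changed type silently replaces
    the earlier one, so the earlier type is lost.
    """
    selected: Schema = {}
    for schema in file_schemas:
        selected.update(schema)  # overwrite per column, no conflict check
    return selected


def find_type_conflicts(file_schemas: List[Schema]) -> Dict[str, List[str]]:
    """Report columns that appear with more than one type across files."""
    seen: Dict[str, set] = {}
    for schema in file_schemas:
        for column, type_name in schema.items():
            seen.setdefault(column, set()).add(type_name)
    return {col: sorted(types) for col, types in seen.items() if len(types) > 1}


# Two hypothetical files: the second adds a column and changes the type of "id".
files = [
    {"id": "INT32", "name": "BINARY"},
    {"id": "INT64", "name": "BINARY", "tags": "BINARY"},
]

cached = last_writer_wins(files)
conflicts = find_type_conflicts(files)
print(cached)
print(conflicts)
```

In this model the added column "tags" survives, while the INT32 type of "id" is dropped without any warning. A merge step would need something like the conflict check to notice the type change.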

I haven't looked at the PR, but you might want to test this out with the metadata cache enabled.

[1] https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/Metadata.java#L420

> Drill fails on Schema changes 
> ------------------------------
>
>                 Key: DRILL-6223
>                 URL: https://issues.apache.org/jira/browse/DRILL-6223
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Execution - Relational Operators
>    Affects Versions: 1.10.0, 1.12.0
>            Reporter: salim achouche
>            Assignee: salim achouche
>            Priority: Major
>             Fix For: 1.14.0
>
>
> Drill query fails when selecting all columns from a Complex Nested Data File Set (Parquet). There are differences in schema among the files:
>  * The Parquet files exhibit differences both at the first level and within nested data types
>  * A select * will not cause an exception, but using a limit clause will
>  * Note also that this issue seems to happen only when multiple Drillbit minor fragments are involved (concurrency higher than one)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
