drill-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Aman Sinha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-4070) Metadata Caching : min/max values are null for varchar columns in auto partitioned data
Date Thu, 12 Nov 2015 17:28:11 GMT

    [ https://issues.apache.org/jira/browse/DRILL-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15002491#comment-15002491
] 

Aman Sinha commented on DRILL-4070:
-----------------------------------

I agree with not maintaing our Parquet fork but we should provide a migration path.  From
Jason's update it sounds like in this case just adding version number to the footers of existing
files would work but any such utility would still need testing.  Some users have already created
hundreds of thousands of files of auto-partitioned data on varchar columns. 

> Metadata Caching : min/max values are null for varchar columns in auto partitioned data
> ---------------------------------------------------------------------------------------
>
>                 Key: DRILL-4070
>                 URL: https://issues.apache.org/jira/browse/DRILL-4070
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Metadata
>    Affects Versions: 1.3.0
>            Reporter: Rahul Challapalli
>            Priority: Critical
>         Attachments: cache.txt, fewtypes_varcharpartition.tar.tgz
>
>
> git.commit.id.abbrev=e78e286
> The metadata cache file created contains incorrect values for min/max fields for varchar
colums. The data is also partitioned on the varchar column
> {code}
> refresh table metadata fewtypes_varcharpartition;
> {code}
> As a result partition pruning is not happening. This was working after DRILL-3937 has
been fixed (d331330efd27dbb8922024c4a18c11e76a00016b)
> I attached the data set and the cache file



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message