drill-issues mailing list archives

From "Aman Sinha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-3788) Partition Pruning not taking place with metadata caching when we have ~20k files
Date Wed, 16 Sep 2015 02:02:46 GMT

    [ https://issues.apache.org/jira/browse/DRILL-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746670#comment-14746670 ]

Aman Sinha commented on DRILL-3788:

That is true. In this case the data was probably created with the hierarchical directory structure.
[~rkins] can you run both types of tests? That is:
(1) CTAS auto-partitioning with partitioning column 'x': run metadata refresh, add a new set of auto-partitioned files to the same directory, and then run your query with a filter on 'x'.
(2) Create the hierarchical directory structure, refresh metadata, and query with filters on these directories (it sounds like you are doing this).
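
A rough sketch of what the two tests might look like in Drill SQL. The workspace `dfs.tmp`, the table name `lineitem_part`, the source table `lineitem_src`, the column `x`, and the filter values are all placeholders for illustration, not names taken from the actual test setup; only `lineitem/2006/1` and the `dir0` filter come from the issue description.

{code}
-- Test (1): CTAS auto-partitioning on a column, here called x (hypothetical).
-- The partition column must appear in the select list.
create table dfs.tmp.`lineitem_part`
partition by (x)
as select x, l_returnflag, l_linestatus
from dfs.tmp.`lineitem_src`;

-- Build the metadata cache for the new table.
refresh table metadata dfs.tmp.`lineitem_part`;

-- (Outside Drill: copy a second set of auto-partitioned files
--  into the lineitem_part directory before running the query.)

-- Check the plan to see whether pruning on x takes place.
explain plan for
select l_returnflag, l_linestatus
from dfs.tmp.`lineitem_part`
where x = 1;

-- Test (2): hierarchical directory structure, as in the issue description.
refresh table metadata `lineitem/2006/1`;

explain plan for
select l_returnflag, l_linestatus
from `lineitem/2006/1`
where dir0 = 1 or dir0 = 2;
{code}

In both cases the thing to compare is the number of files listed in the scan of the resulting plan, with and without the filter.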

> Partition Pruning not taking place with metadata caching when we have ~20k files
> --------------------------------------------------------------------------------
>                 Key: DRILL-3788
>                 URL: https://issues.apache.org/jira/browse/DRILL-3788
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.2.0
>            Reporter: Rahul Challapalli
>            Assignee: Aman Sinha
>            Priority: Critical
>             Fix For: 1.2.0
>         Attachments: plan.txt
> git.commit.id.abbrev=240a455
> Partition pruning did not take place for the query below after I executed the "refresh table metadata" command:
> {code}
>  explain plan for 
> select
>   l_returnflag,
>   l_linestatus
> from
>   `lineitem/2006/1`
> where
>   dir0=1 or dir0=2
> {code}
> The logs did not indicate that "pruning did not take place"
> Before executing the refresh table metadata command, partition pruning did take effect.
> I am not attaching the data set as it is larger than 10MB. Reach out to me if you need more information.

This message was sent by Atlassian JIRA
