drill-issues mailing list archives

From "Jinfeng Ni (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-3765) Partition prune rule is unnecessary fired multiple times.
Date Tue, 17 Nov 2015 01:27:10 GMT

    [ https://issues.apache.org/jira/browse/DRILL-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15007761#comment-15007761 ]

Jinfeng Ni commented on DRILL-3765:
-----------------------------------

I should point out that the improvement from this patch depends on how much of the overall
planning time is spent evaluating the pruning filter in the rule.  The more complex the
pruning filter, the more likely we are to see an improvement.

For example, in the previous comment I compared the planning time for a pruning filter with
a 5-value in-list (5.2 seconds vs 9.4 seconds).  If the pruning filter is reduced to a single
= condition, the planning time becomes 3.709 seconds vs 4.65 seconds. That is, a simpler
pruning filter yields a smaller improvement, which seems reasonable.

{code}
explain plan for select ss_sold_date_sk, ss_sold_time_sk, ss_item_sk, ss_customer_sk from
dfs.tmp.store_pb_item_sk where ss_item_sk = 100 and ss_customer_sk = 96479;

1 row selected (3.709 seconds)

alter session set `planner.enable_hep_partition_pruning` = false;

-- same explain plan rerun with the option disabled
1 row selected (4.65 seconds)
{code}
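
To put the two comparisons on the same footing, the relative reduction in planning time can
be worked out from the figures quoted in this thread. This is only a rough sketch: the
function name is made up for illustration, and it assumes the smaller figure in each pair
is the run with the patch active.

```python
def relative_reduction(faster: float, slower: float) -> float:
    """Fraction of planning time saved, relative to the slower run."""
    return (slower - faster) / slower


# Timings in seconds, as reported in this thread.
in_list = relative_reduction(5.2, 9.4)       # 5-value in-list pruning filter
single_eq = relative_reduction(3.709, 4.65)  # single equality pruning filter

print(f"in-list filter:  {in_list:.1%}")
print(f"equality filter: {single_eq:.1%}")
```

With these numbers the in-list case saves roughly 45% of the planning time and the
single-equality case roughly 20%, which matches the observation that simpler filters
benefit less.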


> Partition prune rule is unnecessary fired multiple times. 
> ----------------------------------------------------------
>
>                 Key: DRILL-3765
>                 URL: https://issues.apache.org/jira/browse/DRILL-3765
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>            Reporter: Jinfeng Ni
>            Assignee: Jinfeng Ni
>
> It seems that the partition prune rule may be fired multiple times, even after the first
> rule execution has pushed the filter into the scan operator. Since partition pruning has to
> build vectors to hold the partition / file / directory information, invoking the rule
> unnecessarily may lead to significant memory overhead.
> The Drill planner should avoid unnecessary invocations of the partition prune rule, in order
> to reduce the chance of hitting an OOM exception while the rule is executing.
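
One generic way to keep a rule from re-firing is to mark the operator once the rule has run
and have the rule skip marked operators. The toy model below illustrates that idea only:
`Scan`, `PruneRule`, `plan`, and the `pruned` flag are all hypothetical names and do not
correspond to Drill classes or to the patch attached to this issue.

```python
from dataclasses import dataclass


@dataclass
class Scan:
    """Toy stand-in for a scan operator; not a Drill class."""
    files: list
    pruned: bool = False  # hypothetical marker: pruning already applied


class PruneRule:
    """Toy pruning rule that counts how often its expensive body runs."""

    def __init__(self, guard: bool):
        self.guard = guard
        self.evaluations = 0

    def matches(self, scan: Scan) -> bool:
        # With the guard on, a scan that was already pruned is skipped.
        return not (self.guard and scan.pruned)

    def apply(self, scan: Scan, keep) -> Scan:
        self.evaluations += 1  # stands in for building partition vectors
        return Scan([f for f in scan.files if keep(f)], pruned=True)


def plan(rule: PruneRule, scan: Scan, keep, passes: int = 3) -> Scan:
    """Offer the scan to the rule several times, as a planner might."""
    for _ in range(passes):
        if rule.matches(scan):
            scan = rule.apply(scan, keep)
    return scan


files = ["dir0/a.parquet", "dir1/b.parquet", "dir0/c.parquet"]
keep = lambda f: f.startswith("dir0/")

unguarded, guarded = PruneRule(guard=False), PruneRule(guard=True)
result_u = plan(unguarded, Scan(files), keep)
result_g = plan(guarded, Scan(files), keep)
print(unguarded.evaluations, guarded.evaluations)
```

Both runs end with the same pruned file list, but the unguarded rule does its expensive
work on every pass while the guarded rule does it once.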



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
