drill-issues mailing list archives

From "Jacques Nadeau (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-2568) New partition pruning prevents the optimization for trivial COUNT(*) queries
Date Thu, 26 Mar 2015 02:35:53 GMT

    [ https://issues.apache.org/jira/browse/DRILL-2568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381248#comment-14381248 ]

Jacques Nadeau commented on DRILL-2568:
---------------------------------------

Can you please post this on Review Board?

> New partition pruning prevents the optimization for trivial COUNT(*) queries
> ----------------------------------------------------------------------------
>
>                 Key: DRILL-2568
>                 URL: https://issues.apache.org/jira/browse/DRILL-2568
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 0.8.0
>            Reporter: Aman Sinha
>            Assignee: Aman Sinha
>         Attachments: 0001-DRILL-2568-Drop-filter-plan-node-if-all-conjuncts-ha.patch
>
>
> With the new interpreter-based partition pruning, if the query has only partition filters
> and they are pushed into the Scan, we don't drop the Filter node from the plan. This prevents
> the optimization for COUNT(*) queries against parquet files, where we read the count values
> directly from the parquet files instead of scanning and aggregating. The ConvertCountToDirectScan
> rule is not applied if there is an intervening Filter between the Scan and the Aggregate
> nodes.
> {code}
> 0: jdbc:drill:zk=local> explain plan for select count(*) from dfs.`/Users/asinha/data/multilevel/parquet` where dir0=1995;
> +------------+------------+
> |    text    |    json    |
> +------------+------------+
> | 00-00    Screen
> 00-01      StreamAgg(group=[{}], EXPR$0=[COUNT()])
> 00-02        Project($f0=[0])
> 00-03          SelectionVectorRemover
> 00-04            Filter(condition=[=($0, 1995)])
> 00-05              Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=file:/Users/asinha/data/multilevel/parquet/1995/Q1/orders_95_q1.parquet],
>                      ReadEntryWithPath [path=file:/Users/asinha/data/multilevel/parquet/1995/Q2/orders_95_q2.parquet],
>                      ReadEntryWithPath [path=file:/Users/asinha/data/multilevel/parquet/1995/Q3/orders_95_q3.parquet],
>                      ReadEntryWithPath [path=file:/Users/asinha/data/multilevel/parquet/1995/Q4/orders_95_q4.parquet]],
>                      selectionRoot=/Users/asinha/data/multilevel/parquet, numFiles=4, columns=[`dir0`]]])
> {code}
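
For readers following the thread, the interaction described in the issue can be sketched with a toy plan model. This is not Drill's planner code: `Node`, `kinds`, `drop_pruned_filters` and `count_rule_matches` are hypothetical names, the plan is simplified to a single-input chain, each filter conjunct is reduced to the one column it references, and the two operator shapes accepted by `count_rule_matches` are an assumption standing in for the real ConvertCountToDirectScan pattern.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class Node:
    """One operator in a toy single-input plan (hypothetical, not a Drill class)."""
    kind: str                        # "Aggregate", "Project", "Filter" or "Scan"
    child: Optional["Node"] = None
    conjuncts: Tuple[str, ...] = ()  # Filter only: column referenced by each conjunct


def kinds(node: Optional[Node]) -> List[str]:
    """Operator names from the root of the plan down to the leaf."""
    out = []
    while node is not None:
        out.append(node.kind)
        node = node.child
    return out


def drop_pruned_filters(node: Optional[Node], pruned: Set[str]) -> Optional[Node]:
    """Remove every Filter whose conjuncts all reference columns in `pruned`.

    `pruned` is assumed to hold the partition columns whose filters were
    already pushed into the Scan, so re-checking them in a Filter is redundant.
    """
    if node is None:
        return None
    child = drop_pruned_filters(node.child, pruned)
    if node.kind == "Filter" and node.conjuncts and all(c in pruned for c in node.conjuncts):
        return child  # the Scan already applied these conditions
    return Node(node.kind, child, node.conjuncts)


def count_rule_matches(node: Optional[Node]) -> bool:
    """Assumed stand-in for the rule's pattern: no Filter between Aggregate and Scan."""
    return kinds(node) in (["Aggregate", "Scan"], ["Aggregate", "Project", "Scan"])


# Logical shape of the plan shown in the issue: the Filter on dir0 is still present.
plan = Node("Aggregate", Node("Project", Node("Filter", Node("Scan"), ("dir0",))))
fixed = drop_pruned_filters(plan, {"dir0"})
print(kinds(plan), count_rule_matches(plan))
print(kinds(fixed), count_rule_matches(fixed))
```

In this sketch a Filter that also references a non-partition column is left in place, because the Scan cannot answer that condition by pruning directories alone.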



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
