From "Aman Sinha (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-2568) New partition pruning prevents the optimization for trivial COUNT(*) queries
Date Wed, 25 Mar 2015 22:19:53 GMT
Aman Sinha created DRILL-2568:

             Summary: New partition pruning prevents the optimization for trivial COUNT(*) queries
                 Key: DRILL-2568
                 URL: https://issues.apache.org/jira/browse/DRILL-2568
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
    Affects Versions: 0.8.0
            Reporter: Aman Sinha
            Assignee: Aman Sinha

With the new interpreter-based partition pruning, if a query has only partition filters
and they are pushed into the Scan, we don't drop the Filter node from the plan. This prevents
the optimization for COUNT(*) queries against Parquet files, where we read the count values
directly from the Parquet files instead of scanning and aggregating. The ConvertCountToDirectScan
rule is not applied if there is an intervening Filter between the Scan and the Aggregate. For example:

0: jdbc:drill:zk=local> explain plan for select count(*) from dfs.`/Users/asinha/data/multilevel/parquet`
where dir0=1995;
|    text    |    json    |
| 00-00    Screen
00-01      StreamAgg(group=[{}], EXPR$0=[COUNT()])
00-02        Project($f0=[0])
00-03          SelectionVectorRemover
00-04            Filter(condition=[=($0, 1995)])
00-05              Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=file:/Users/asinha/data/multilevel/parquet/1995/Q1/orders_95_q1.parquet],
ReadEntryWithPath [path=file:/Users/asinha/data/multilevel/parquet/1995/Q2/orders_95_q2.parquet],
ReadEntryWithPath [path=file:/Users/asinha/data/multilevel/parquet/1995/Q3/orders_95_q3.parquet],
ReadEntryWithPath [path=file:/Users/asinha/data/multilevel/parquet/1995/Q4/orders_95_q4.parquet]],
selectionRoot=/Users/asinha/data/multilevel/parquet, numFiles=4, columns=[`dir0`]]])
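
To make the failure mode concrete, here is a minimal toy sketch in Python of why the leftover Filter blocks the COUNT(*) rewrite. It is not Drill's planner code: the function names `count_rule_matches` and `drop_fully_pruned_filter` are hypothetical, the plan is modeled as a flat top-to-bottom list of operator names taken from the plan text above, and the set of shapes the rule accepts (Aggregate over Scan, optionally with a Project in between) is an assumption made for illustration.

```python
# Toy model only: operator names mirror the plan text, not Drill's Java classes.
AGG, PROJECT, SVR, FILTER, SCAN = (
    "StreamAgg", "Project", "SelectionVectorRemover", "Filter", "Scan")


def count_rule_matches(plan):
    """Return True if the operators under Screen are Aggregate [-> Project] -> Scan.

    Assumed shapes for illustration; any other operator in between
    (such as a Filter) makes the match fail.
    """
    body = [op for op in plan if op != "Screen"]
    return body in ([AGG, SCAN], [AGG, PROJECT, SCAN])


def drop_fully_pruned_filter(plan, filter_fully_applied_by_pruning):
    """Remove the Filter and its SelectionVectorRemover when partition
    pruning has already enforced the whole filter condition (hypothetical fix)."""
    if not filter_fully_applied_by_pruning:
        return list(plan)
    return [op for op in plan if op not in (FILTER, SVR)]


# The plan reported above: the Filter on dir0 is still present after pruning.
reported_plan = ["Screen", AGG, PROJECT, SVR, FILTER, SCAN]
fixed_plan = drop_fully_pruned_filter(reported_plan, True)

print(count_rule_matches(reported_plan))
print(count_rule_matches(fixed_plan))
```

In this sketch the reported plan does not match because Filter and SelectionVectorRemover sit between the Project and the Scan; once they are dropped, the remaining Aggregate, Project, Scan chain has the shape the rule is assumed to look for.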

