drill-issues mailing list archives

From "Rahul Challapalli (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-3380) CTAS Auto Partitioning : We are not pruning when we use functions in the select list
Date Fri, 26 Jun 2015 00:35:04 GMT

    [ https://issues.apache.org/jira/browse/DRILL-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602185#comment-14602185 ]

Rahul Challapalli commented on DRILL-3380:
------------------------------------------

Even in the first case, where pruning does take effect, I do not understand why the plan
contains two Project operators.

> CTAS Auto Partitioning : We are not pruning when we use functions in the select list
> ------------------------------------------------------------------------------------
>
>                 Key: DRILL-3380
>                 URL: https://issues.apache.org/jira/browse/DRILL-3380
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>            Reporter: Rahul Challapalli
>            Assignee: Steven Phillips
>            Priority: Critical
>
> git.commit.id.abbrev=5a34d81
> I used the query below to create a partitioned data set:
> {code}
> create table `lineitem` partition by (l_moddate) as select l.*, l_shipdate - extract(day from l_shipdate) + 1 l_moddate from cp.`tpch/lineitem.parquet` l;
> {code}
> The plan for the query below scans only one file:
> {code}
> explain plan for select * from `lineitem` where l_moddate = date '1994-07-01';
> 00-00    Screen
> 00-01      Project(*=[$0])
> 00-02        Project(*=[$0])
> 00-03          Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/drill/testdata/ctas_auto_partition/tpch_single_partition/lineitem/0_0_31.parquet]], selectionRoot=/drill/testdata/ctas_auto_partition/tpch_single_partition/lineitem, numFiles=1, columns=[`*`]]])
> {code}
> However, the plan below indicates a full table scan:
> {code}
> explain plan for select count(*) from `tpch_single_partition/lineitem` where l_moddate = date '1994-07-01';
> 00-00    Screen
> 00-01      StreamAgg(group=[{}], EXPR$0=[COUNT()])
> 00-02        Project($f0=[0])
> 00-03          SelectionVectorRemover
> 00-04            Filter(condition=[=($0, 1994-07-01)])
> 00-05              Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/ctas_auto_partition/tpch_single_partition/lineitem]], selectionRoot=/drill/testdata/ctas_auto_partition/tpch_single_partition/lineitem, numFiles=1, columns=[`l_moddate`]]])
> {code}
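
For readers following the report, the expected behaviour can be sketched outside Drill. The snippet below is only an illustration, not Drill's planner code: it mirrors the CTAS expression `l_shipdate - extract(day from l_shipdate) + 1` (the first day of the ship month) and shows which files an equality filter on `l_moddate` should keep when each file holds a single partition value. The function names, ship dates, and file names are hypothetical.

```python
from datetime import date, timedelta


def first_of_month(ship_date: date) -> date:
    # Mirrors the CTAS expression: l_shipdate - extract(day from l_shipdate) + 1
    return ship_date - timedelta(days=ship_date.day) + timedelta(days=1)


def prune(files: dict, wanted: date) -> list:
    # Keep only the files whose single partition value equals the filter literal.
    return sorted(name for name, value in files.items() if value == wanted)


# Hypothetical ship dates; each distinct l_moddate value gets its own file.
ship_dates = [date(1994, 7, 15), date(1994, 7, 31), date(1994, 8, 2)]
files = {}
for d in ship_dates:
    value = first_of_month(d)
    files[f"part_{value.isoformat()}.parquet"] = value

# Equivalent of: where l_moddate = date '1994-07-01'
selected = prune(files, date(1994, 7, 1))
print(selected)
```

Under these assumptions the filter should touch one file regardless of whether the select list is `*` or `count(*)`, which is the behaviour the second plan above fails to show.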



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
