drill-issues mailing list archives

From "Jinfeng Ni (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-3383) CTAS Auto Partitioning : We are adding an additional project to the plan
Date Fri, 26 Jun 2015 15:16:05 GMT

    [ https://issues.apache.org/jira/browse/DRILL-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603024#comment-14603024 ]

Jinfeng Ni commented on DRILL-3383:
-----------------------------------

The additional project, although not optimal, should not lead to non-trivial performance overhead.
When a redundant project operator has no expression to evaluate (which is the case
in the plan you got, since the projected expression is the same as its input), Drill's execution
only needs to transfer the buffers from the incoming batch to the outgoing batch, which does not
cause significant performance overhead.
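
As a rough illustration of the point above, here is a minimal sketch in Python (not Drill's actual
Java implementation). The `project` function, the dict-based batches, and the column names are all
hypothetical; the sketch only contrasts handing over an existing buffer with evaluating an
expression row by row.

```python
def project(incoming, exprs):
    """Hypothetical project step over a batch modeled as {column name: list of values}.

    exprs maps an output column name to either:
      - a str naming an input column (pass-through, nothing to evaluate), or
      - a (source column, function) pair that must be evaluated per row.
    Returns the outgoing batch and the number of row-level evaluations performed.
    """
    outgoing = {}
    rows_evaluated = 0
    for out_name, expr in exprs.items():
        if isinstance(expr, str):
            # Pass-through: reuse the same buffer object, no per-row work.
            outgoing[out_name] = incoming[expr]
        else:
            source, fn = expr
            # Real evaluation: a new buffer is built one value at a time.
            outgoing[out_name] = [fn(v) for v in incoming[source]]
            rows_evaluated += len(incoming[source])
    return outgoing, rows_evaluated


# Illustrative data only; the column name is borrowed from the TPC-H lineitem table.
incoming = {"l_orderkey": [1, 2, 3]}

passthrough, n_pass = project(incoming, {"l_orderkey": "l_orderkey"})
computed, n_eval = project(incoming, {"plus_one": ("l_orderkey", lambda v: v + 1)})
```

In this sketch the pass-through call returns the very same list object it was given and performs
zero row-level evaluations, which is the sense in which a redundant project is cheap.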

Given that, I am changing the priority from "Major" to "Minor".

 

> CTAS Auto Partitioning : We are adding an additional project to the plan
> ------------------------------------------------------------------------
>
>                 Key: DRILL-3383
>                 URL: https://issues.apache.org/jira/browse/DRILL-3383
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>            Reporter: Rahul Challapalli
>            Assignee: Jinfeng Ni
>            Priority: Minor
>             Fix For: 1.1.0
>
>
> git.commit.id.abbrev=5a34d81
> I used the query below to create a partitioned data set
> {code}
> create table `lineitem` partition by (l_moddate) as select l.*, l_shipdate - extract(day from l_shipdate) + 1 l_moddate from cp.`tpch/lineitem.parquet` l;
> {code}
> The plan for the below query has 2 projects instead of 1
> {code}
> explain plan for select * from `lineitem` where l_moddate = date '1994-07-01';
>  00-00    Screen
> 00-01      Project(*=[$0])
> 00-02        Project(*=[$0])
> 00-03          Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/drill/testdata/ctas_auto_partition/tpch_single_partition/lineitem/0_0_31.parquet]], selectionRoot=/drill/testdata/ctas_auto_partition/tpch_single_partition/lineitem, numFiles=1, columns=[`*`]]])
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
