drill-issues mailing list archives

From "Aman Sinha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-1150) Sub-optimal expression pushdown for slightly modified version of Tpch 19
Date Wed, 16 Jul 2014 16:58:04 GMT

    [ https://issues.apache.org/jira/browse/DRILL-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063712#comment-14063712
] 

Aman Sinha commented on DRILL-1150:
-----------------------------------

Correction: the 2 Projects are above the Part scan, not the Lineitem scan. The expression
pushdown occurs on both sides of the join, for both Lineitem and Part.
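
To illustrate why the two stacked Projects above the Part scan are redundant, here is a minimal sketch in Python. It is not Drill's planner code. The expression strings are abbreviated from the first five outputs of Project 00-08 in the plan (character set and collation are dropped), and `merge_projects` is a hypothetical helper name. The sketch assumes the upper Project (00-06) contains only column references, which is what the plan shows.

```python
# Lower Project (00-08 in the plan): expressions over the Part scan columns.
# Abbreviated for illustration; only the first five outputs are listed.
lower_exprs = [
    "$2",                # p_partkey
    "$3",                # p_container
    "CAST($1):CHAR(8)",  # $f5
    ">=($0, 1)",         # $f6
    "<=($0, 5)",         # $f7
]

# Upper Project (00-06): every output is a plain reference to one output
# column of the lower Project, given here as that column's index.
upper_refs = [0, 1, 2, 3, 4]


def merge_projects(upper, lower):
    """Collapse two stacked Projects into one by substituting each
    upper column reference with the lower expression it points at."""
    return [lower[index] for index in upper]


merged_exprs = merge_projects(upper_refs, lower_exprs)
print(merged_exprs)
```

Because the upper Project only renames columns in this plan, the merged Project computes the same expressions as the lower one, so a single Project would be enough.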

> Sub-optimal expression pushdown for slightly modified version of Tpch 19
> ------------------------------------------------------------------------
>
>                 Key: DRILL-1150
>                 URL: https://issues.apache.org/jira/browse/DRILL-1150
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Aman Sinha
>            Assignee: Jinfeng Ni
>
> A slightly modified version of TPCH 19, called 19_1 in the TestTpchDistributed JUnit
test suite, produces the following plan on latest master version 699851b. The plan shows
several expressions pushed into the Project just above the Lineitem scan, whereas these expressions
should ideally be evaluated after the join, since there is no need to evaluate an expression
for a row that does not qualify for the join. Also notice that there are 2 Projects above the
Lineitem scan; these should have been merged into one.
> 00-00    Screen
> 00-01      StreamAgg(group=[{}], revenue=[SUM($0)])
> 00-02        Project($f0=[*($2, -(1, $3))])
> 00-03          SelectionVectorRemover
> 00-04            Filter(condition=[OR(AND(=($15, 'Brand#41'), OR(=($14, 'SM CASE'), =($14,
'SM BOX'), =($14, 'SM PACK'), =($14, 'SM PKG')), $4, $5, $16, $17, OR(=($0, 'AIR'), =($0,
'AIR REG')), =($6, 'DELIVER IN PERSON')), AND(=($18, 'Brand#13'), OR(=($14, 'MED BAG'), =($14,
'MED BOX'), =($14, 'MED PKG'), =($14, 'MED PACK')), $7, $8, $19, $20, OR(=($0, 'AIR'), =($0,
'AIR REG')), =($9, 'DELIVER IN PERSON')), AND(=($21, 'Brand#55'), OR(=($14, 'LG CASE'), =($14,
'LG BOX'), =($14, 'LG PACK'), =($14, 'LG PKG')), $10, $11, $22, $23, OR(=($0, 'AIR'), =($0,
'AIR REG')), =($12, 'DELIVER IN PERSON')))])
> 00-05              HashJoin(condition=[=($1, $13)], joinType=[inner])
> 00-07                Project(l_shipmode=[$5], l_partkey=[$4], l_extendedprice=[$3], l_discount=[$1],
$f7=[>=($2, 2)], $f8=[<=($2, +(2, 10))], $f9=[CAST($0):CHAR(17) CHARACTER SET "ISO-8859-1"
COLLATE "ISO-8859-1$en_US$primary"], $f10=[>=($2, 14)], $f11=[<=($2, +(14, 10))], $f12=[CAST($0):CHAR(17)
CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f13=[>=($2, 23)], $f14=[<=($2,
+(23, 10))], $f15=[CAST($0):CHAR(17) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"])
> 00-09                  ProducerConsumer
> 00-11                    Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath
[path=/tpch/lineitem.parquet]], selectionRoot=/tpch/lineitem.parquet, columns=[SchemaPath
[`l_shipmode`], SchemaPath [`l_partkey`], SchemaPath [`l_extendedprice`], SchemaPath [`l_discount`],
SchemaPath [`l_quantity`], SchemaPath [`l_shipinstruct`]]]])
> 00-06                Project(p_partkey=[$0], p_container=[$1], $f5=[$2], $f6=[$3], $f70=[$4],
$f80=[$5], $f90=[$6], $f100=[$7], $f110=[$8], $f120=[$9], $f130=[$10])
> 00-08                  Project(p_partkey=[$2], p_container=[$3], $f5=[CAST($1):CHAR(8)
CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f6=[>=($0, 1)], $f7=[<=($0,
5)], $f8=[CAST($1):CHAR(8) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"],
$f9=[>=($0, 1)], $f10=[<=($0, 10)], $f11=[CAST($1):CHAR(8) CHARACTER SET "ISO-8859-1"
COLLATE "ISO-8859-1$en_US$primary"], $f12=[>=($0, 1)], $f13=[<=($0, 15)])
> 00-10                    ProducerConsumer
> 00-12                      Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath
[path=/tpch/part.parquet]], selectionRoot=/tpch/part.parquet, columns=[SchemaPath [`p_partkey`],
SchemaPath [`p_container`], SchemaPath [`p_brand`], SchemaPath [`p_size`]]]])
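
The cost argument in the description above can be sketched with a small Python example. It is an illustration only, not Drill code: the rows are made-up toy data, the function names are hypothetical, and the predicate mirrors the `>=($2, 2)` and `<=($2, +(2, 10))` expressions from Project 00-07, combined into one check for brevity. The join is modeled as a membership test, which assumes the Part key is unique.

```python
def quantity_in_range(row, counter):
    """Evaluate the quantity predicate and count how often it runs."""
    counter[0] += 1
    return 2 <= row["l_quantity"] <= 2 + 10


# Toy data (assumption): four Lineitem rows, two of which match a Part key.
lineitem_rows = [
    {"l_partkey": 1, "l_quantity": 5},
    {"l_partkey": 2, "l_quantity": 20},
    {"l_partkey": 3, "l_quantity": 7},
    {"l_partkey": 4, "l_quantity": 3},
]
part_keys = {1, 4}


def eval_before_join(rows, keys):
    """Expression pushed below the join: evaluated for every scanned row."""
    counter = [0]
    projected = [(row, quantity_in_range(row, counter)) for row in rows]
    joined = [(row, flag) for row, flag in projected if row["l_partkey"] in keys]
    return [row["l_partkey"] for row, flag in joined if flag], counter[0]


def eval_after_join(rows, keys):
    """Expression evaluated above the join: only for rows that joined."""
    counter = [0]
    joined = [row for row in rows if row["l_partkey"] in keys]
    result = [row["l_partkey"] for row in joined if quantity_in_range(row, counter)]
    return result, counter[0]


before_result, before_count = eval_before_join(lineitem_rows, part_keys)
after_result, after_count = eval_after_join(lineitem_rows, part_keys)
print(before_result, before_count)
print(after_result, after_count)
```

Both strategies return the same rows, but evaluating after the join runs the predicate only on the rows that survive the join. How much that saves depends on the join selectivity.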



--
This message was sent by Atlassian JIRA
(v6.2#6252)
