drill-issues mailing list archives

From "Aman Sinha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-6193) Latest Calcite optimized out join condition and cause "This query cannot be planned possibly due to either a cartesian join or an inequality join"
Date Wed, 28 Feb 2018 16:57:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16380661#comment-16380661 ]

Aman Sinha commented on DRILL-6193:

[~vvysotskyi]'s suggestion seems reasonable in the near term, and the change can be made in
Drill.  I don't know how much work is involved in overriding the base class's filter().

It would be helpful if we could distinguish between a pure cartesian join (submitted by the
user) and one that resulted from the RexSimplify pass.
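
One way to picture that distinction is the rough Python sketch below. It is a toy model under stated assumptions, not Drill or Calcite code: the function names, the tuple representation of predicates, and the three labels are all hypothetical. The idea is to compare the predicates before and after simplification and check whether any of them still compares a column from one join input with a column from the other.

```python
# Hypothetical sketch: these names are illustrative, not Drill or Calcite APIs.
# A predicate is a pair (a, b); each side is a column name or a constant.

def compares_both_inputs(predicates, left_cols, right_cols):
    """Return True if any predicate compares a left-input column with a right-input column."""
    return any(
        (a in left_cols and b in right_cols) or (a in right_cols and b in left_cols)
        for a, b in predicates
    )


def classify_join(before, after, left_cols, right_cols):
    """Classify a join from its predicates before and after simplification."""
    if compares_both_inputs(after, left_cols, right_cols):
        return "equi-join"
    if compares_both_inputs(before, left_cols, right_cols):
        return "cartesian after simplification"
    return "cartesian as written"


left_cols = {"n_nationkey"}
right_cols = {"r_regionkey"}
before = [("n_nationkey", "r_regionkey"), ("n_nationkey", 5), ("r_regionkey", 5)]
after = [("n_nationkey", 5), ("r_regionkey", 5)]

simplified_away = classify_join(before, after, left_cols, right_cols)
written_by_user = classify_join([], [], left_cols, right_cols)
print(simplified_away)
print(written_by_user)
```

In this model only the second case, where the user supplied a join predicate that simplification later removed, would be a candidate for restoring the condition; a join with no cross-input predicate at all would still be reported as cartesian.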

BTW, if the query was written with the ON clause, then the join predicate remains intact,
so this can be a workaround: 
{noformat}
0: jdbc:drill:zk=local> explain plan without implementation for select count(*) from cp.`tpch/nation.parquet` n
inner join cp.`tpch/region.parquet` r on n.n_nationkey = r.r_regionkey
where n.n_nationkey = 5 and r.r_regionkey 
DrillScreenRel
  DrillAggregateRel(group=[{}], EXPR$0=[COUNT()])
      DrillJoinRel(condition=[=($0, $1)], joinType=[inner])
        DrillFilterRel(condition=[=($0, 5)])
          DrillScanRel(table=[[cp, tpch/nation.parquet]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=classpath:/tpch/nation.parquet]], selectionRoot=classpath:/tpch/nation.parquet, numFiles=1, numRowGroups=1, usedMetadataFile=false, columns=[`n_nationkey`]]])
        DrillFilterRel(condition=[=($0, 5)])
          DrillScanRel(table=[[cp, tpch/region.parquet]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=classpath:/tpch/region.parquet]], selectionRoot=classpath:/tpch/region.parquet, numFiles=1, numRowGroups=1, usedMetadataFile=false, columns=[`r_regionkey`]]])
{noformat}

> Latest Calcite optimized out join condition and cause "This query cannot be planned possibly
due to either a cartesian join or an inequality join"
> --------------------------------------------------------------------------------------------------------------------------------------------------
>                 Key: DRILL-6193
>                 URL: https://issues.apache.org/jira/browse/DRILL-6193
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.13.0
>            Reporter: Chunhui Shi
>            Assignee: Hanumath Rao Maduri
>            Priority: Blocker
>             Fix For: 1.13.0
> I got the same error on the tip of Apache master with the MapR profile (before the Hive upgrade)
and on changeset 9e944c97ee6f6c0d1705f09d531af35deed2e310, the last commit of the Calcite upgrade,
using the failing query reported in the functional tests, now run against parquet files:
> {quote}
> FROM cp.`tpch/lineitem.parquet` L, cp.`tpch/orders.parquet` O
> WHERE cast(L.L_ORDERKEY as int) = cast(O.O_ORDERKEY as int) AND cast(L.L_LINENUMBER as
int) = 7 AND cast(L.L_ORDERKEY as int) = 10208 AND cast(O.O_ORDERKEY as int) = 10208;
>  {quote}
> However, with Drill built on commit ef0fafea214e866556fa39c902685d48a56001e1, the commit
right before the Calcite upgrade commits, the same query worked.
> The cause is that the latest Calcite simplifies the predicates, and during this process
"cast(L.L_ORDERKEY as int) = cast(O.O_ORDERKEY as int)" is considered redundant and is
removed, so the logical plan of this query ends up with an always-true condition for the join:
> {quote}DrillJoinRel(condition=[true], joinType=[inner])
> {quote}
> In the previous version, the plan had:
> {quote}DrillJoinRel(condition=[=($5, $0)], joinType=[inner])
> {quote}
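
To make the "considered redundant" step concrete, here is a minimal Python sketch of the reasoning. It is a toy model, not Calcite's RexSimplify: the tuple representation of predicates and both helper functions are assumptions made for illustration. Because both sides of the join equality are also equated with the constant 10208, the equality between them is implied by the two constant filters, and once it is dropped no predicate comparing the two inputs remains.

```python
# Toy model, not Calcite's RexSimplify: each predicate is a pair (left, right),
# where a side is either a column-expression string or an int constant.

def pinned_constants(predicates):
    """Map each expression to the constant it is directly equated with."""
    pinned = {}
    for left, right in predicates:
        if isinstance(left, str) and isinstance(right, int):
            pinned[left] = right
        elif isinstance(left, int) and isinstance(right, str):
            pinned[right] = left
    return pinned


def drop_implied_equalities(predicates):
    """Remove expr = expr predicates whose two sides are pinned to the same constant."""
    pinned = pinned_constants(predicates)
    kept = []
    for left, right in predicates:
        both_exprs = isinstance(left, str) and isinstance(right, str)
        if both_exprs and left in pinned and pinned.get(right) == pinned[left]:
            continue  # implied by the two constant filters, so treated as redundant
        kept.append((left, right))
    return kept


L_KEY = "cast(L.L_ORDERKEY as int)"
O_KEY = "cast(O.O_ORDERKEY as int)"
L_LINE = "cast(L.L_LINENUMBER as int)"

# The WHERE clause of the reported query, one pair per equality.
where_clause = [
    (L_KEY, O_KEY),
    (L_LINE, 7),
    (L_KEY, 10208),
    (O_KEY, 10208),
]

remaining = drop_implied_equalities(where_clause)
# Only a predicate comparing two expressions can serve as an equi-join condition.
join_conditions = [p for p in remaining if isinstance(p[0], str) and isinstance(p[1], str)]
print(join_conditions or "true")  # prints: true
```

The three remaining predicates are all single-table filters, which matches the condition=[true] join shown in the plan above.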

This message was sent by Atlassian JIRA
