impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alex Behm (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-3063: Separate join inversion from join ordering.
Date Wed, 17 Aug 2016 07:36:14 GMT
Alex Behm has uploaded a new patch set (#4).

Change subject: IMPALA-3063: Separate join inversion from join ordering.
......................................................................

IMPALA-3063: Separate join inversion from join ordering.

Before this change joins were inverted while doing join ordering.
That approach was unnecessarily complex because it required
modifying the global analysis state for correct conjunct
placement, etc. However, join inversion is independent of join
ordering, and the existing approach could lead to generating
invalid plans with distributed non-equi right outer/semi joins,
which we cannot execute in the backend.

After this change joins are inverted in a separate pass over
the single-node plan. This simplifies the inversion
logic and allows us to avoid generating those invalid plans.

Note that this change is not only a separation of functionality
for the following reasons:
1. Our join cardinality estimation is not symmetric, i.e., A JOIN B
may not give the same estimate as B JOIN A due to our FK/PK detection
heuristic. In the context of this patch this means that an inverted
join may have a different cardinality estimate, so plans may change
depending on whether the inversion is done during join ordering of after.
2. We currently only invert outer/semi/anti joins based on the rhs table
ref join op. In this patch I want to preserve the existing behavior as
much as possible, but when doing the join ordering in a separate pass we
may see a join opn in a JoinNode that is different from the rhs table ref.
So in some situations the inversion behavior based on the join op could be
different and there are some examples in this patch.

This patch also moves the logic of converting hash joins to
nested-loop joins into a separate pass over the single-node plan.

Change-Id: If86db7753fc585bb4c69612745ec0103278888a4
---
M fe/src/main/java/com/cloudera/impala/analysis/Analyzer.java
M fe/src/main/java/com/cloudera/impala/analysis/BinaryPredicate.java
M fe/src/main/java/com/cloudera/impala/analysis/SelectList.java
M fe/src/main/java/com/cloudera/impala/analysis/TableRef.java
M fe/src/main/java/com/cloudera/impala/planner/AggregationNode.java
M fe/src/main/java/com/cloudera/impala/planner/AnalyticEvalNode.java
M fe/src/main/java/com/cloudera/impala/planner/DistributedPlanner.java
M fe/src/main/java/com/cloudera/impala/planner/ExchangeNode.java
M fe/src/main/java/com/cloudera/impala/planner/HashJoinNode.java
M fe/src/main/java/com/cloudera/impala/planner/HdfsScanNode.java
M fe/src/main/java/com/cloudera/impala/planner/JoinNode.java
M fe/src/main/java/com/cloudera/impala/planner/NestedLoopJoinNode.java
M fe/src/main/java/com/cloudera/impala/planner/PlanNode.java
M fe/src/main/java/com/cloudera/impala/planner/Planner.java
M fe/src/main/java/com/cloudera/impala/planner/SelectNode.java
M fe/src/main/java/com/cloudera/impala/planner/SingleNodePlanner.java
M fe/src/main/java/com/cloudera/impala/planner/SingularRowSrcNode.java
M fe/src/main/java/com/cloudera/impala/planner/SortNode.java
M fe/src/main/java/com/cloudera/impala/planner/SubplanNode.java
M fe/src/main/java/com/cloudera/impala/planner/UnionNode.java
M fe/src/main/java/com/cloudera/impala/planner/UnnestNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M testdata/workloads/functional-planner/queries/PlannerTest/nested-loop-join.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
28 files changed, 592 insertions(+), 476 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/3846/4
-- 
To view, visit http://gerrit.cloudera.org:8080/3846
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If86db7753fc585bb4c69612745ec0103278888a4
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <marcel@cloudera.com>

Mime
View raw message