impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alex Behm (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-3063: Separate join inversion from join ordering.
Date Wed, 17 Aug 2016 07:35:59 GMT
Alex Behm has posted comments on this change.

Change subject: IMPALA-3063: Separate join inversion from join ordering.

Patch Set 2:

Commit Message:

Line 7: IMPALA-3063: Separate join inversion from join ordering.
> a) please add this explanation to the commit msg
a) Done.
b) Filed IMPALA-3985
File fe/src/main/java/com/cloudera/impala/analysis/

Line 125:   private boolean isStraightJoin_ = false;
> nice clean-up
File fe/src/main/java/com/cloudera/impala/planner/

Line 81
> do we still have these checks somewhere?
These conditions seem trivially satisfied now that an ExchangeNode must be constructed with
a single child, i.e. after calling the c'tor the should be satisfied.

Note that these checks only used to be run when adding the second child - a case which is
not supported anymore.

Line 95
> what about this?
Restored it, though it's not possible to construct an exch node without a child anymore.
File fe/src/main/java/com/cloudera/impala/planner/

Line 478:     for (BinaryPredicate p: eqJoinConjuncts_) Collections.swap(p.getChildren(),
0, 1);
> add BinaryPredicate.reverse()?
File fe/src/main/java/com/cloudera/impala/planner/

Line 345:    * 1. The right-hand side is a SingularRowSrcNode. We place such a node on the
> the first sentence makes it sound like it inverts *if* the rhs is already a
Agreed. Reformulated.

Line 355:   private void invertJoins(PlanNode root, boolean isSingleNodeExec) {
> explain 2nd param
Done and also renamed for clarity (single node exec was not quite accurate)

Line 360:       for (PlanNode child: root.getChildren()) {
> single line

Line 372:       //    we cannot execute it efficiently.
> the same is true for right-anything if it's a broadcast. do we know we're n
1. Your comment about right-anything is correct and we will make the appropriate distribution
decision when generating the distributed plan (i.e. taking the join op constraints into account)
2. In order to execute a left null aware anti join we need to know whether the build side
contained any row with a NULL in one of the slots that participates in the hash lookup. So,
a right version of that joins is not practical, and a partitioning strategy does not really
work either.

Line 373:       if (joinNode.isStraightJoin() || joinOp.isNullAwareLeftAntiJoin()) {
> what does it mean to deal with 'straight join' here and not for the entire 
We're dealing with a plan tree here, so we don't know which query block this JoinNode originated
from. Therefore, we preserve the straight_join info when creating a JoinNode and check for
it here. Not sure if that answered your question?

Line 380:         // Always place a singular row src on the build side because it has
> has produces

Line 385:         // There is no backend support for distributed non-equi right outer/semi
> this seems to contradict the condition, which triggers if !issinglenodeexec
Expanded comment. Renamed param to isLocalPlan.

PS2, Line 388: build-side hash table
> "the materialized rhs", no? (this still applies to nlj?)
Correct. Done.

Line 398:             !(joinOp.isInnerJoin() && !joinNode.getEqJoinConjuncts().isEmpty()))
> why can't this be applied to inner equi joins?
It can be applied and that's the intended follow-on change. My thinking is that it would be
easier to review the changes in isolation so we don't have to reason about all the plan changes
at once.

Line 404:     // Re-compute tuple ids since their order must correspond to the order of children.
> why?
Expanded comment. I think removing this BE assumption may be involved.

Line 414:   private PlanNode useNljForSmallBuilds(PlanNode root, Analyzer analyzer)
> the name doesn't really reflect the (much narrower) semantics, find a bette

Line 423:     if (joinNode.getJoinOp().isNullAwareLeftAntiJoin()) return root;
> also check that this is really a hash join

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: If86db7753fc585bb4c69612745ec0103278888a4
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Marcel Kornacker <>
Gerrit-HasComments: Yes

View raw message