impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alex Behm (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-3063: Separate join inversion from join ordering.
Date Wed, 17 Aug 2016 07:35:59 GMT
Alex Behm has posted comments on this change.

Change subject: IMPALA-3063: Separate join inversion from join ordering.
......................................................................


Patch Set 2:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/3846/2//COMMIT_MSG
Commit Message:

Line 7: IMPALA-3063: Separate join inversion from join ordering.
> a) please add this explanation to the commit msg
a) Done.
b) Filed IMPALA-3985


http://gerrit.cloudera.org:8080/#/c/3846/2/fe/src/main/java/com/cloudera/impala/analysis/Analyzer.java
File fe/src/main/java/com/cloudera/impala/analysis/Analyzer.java:

Line 125:   private boolean isStraightJoin_ = false;
> nice clean-up
thanks!


http://gerrit.cloudera.org:8080/#/c/3846/2/fe/src/main/java/com/cloudera/impala/planner/ExchangeNode.java
File fe/src/main/java/com/cloudera/impala/planner/ExchangeNode.java:

Line 81
> do we still have these checks somewhere?
These conditions seem trivially satisfied now that an ExchangeNode must be constructed with
a single child, i.e. after calling the c'tor the should be satisfied.

Note that these checks only used to be run when adding the second child - a case which is
not supported anymore.


Line 95
> what about this?
Restored it, though it's not possible to construct an exch node without a child anymore.


http://gerrit.cloudera.org:8080/#/c/3846/2/fe/src/main/java/com/cloudera/impala/planner/JoinNode.java
File fe/src/main/java/com/cloudera/impala/planner/JoinNode.java:

Line 478:     for (BinaryPredicate p: eqJoinConjuncts_) Collections.swap(p.getChildren(),
0, 1);
> add BinaryPredicate.reverse()?
Done


http://gerrit.cloudera.org:8080/#/c/3846/2/fe/src/main/java/com/cloudera/impala/planner/Planner.java
File fe/src/main/java/com/cloudera/impala/planner/Planner.java:

Line 345:    * 1. The right-hand side is a SingularRowSrcNode. We place such a node on the
> the first sentence makes it sound like it inverts *if* the rhs is already a
Agreed. Reformulated.


Line 355:   private void invertJoins(PlanNode root, boolean isSingleNodeExec) {
> explain 2nd param
Done and also renamed for clarity (single node exec was not quite accurate)


Line 360:       for (PlanNode child: root.getChildren()) {
> single line
Done


Line 372:       //    we cannot execute it efficiently.
> the same is true for right-anything if it's a broadcast. do we know we're n
1. Your comment about right-anything is correct and we will make the appropriate distribution
decision when generating the distributed plan (i.e. taking the join op constraints into account)
2. In order to execute a left null aware anti join we need to know whether the build side
contained any row with a NULL in one of the slots that participates in the hash lookup. So,
a right version of that joins is not practical, and a partitioning strategy does not really
work either.


Line 373:       if (joinNode.isStraightJoin() || joinOp.isNullAwareLeftAntiJoin()) {
> what does it mean to deal with 'straight join' here and not for the entire 
We're dealing with a plan tree here, so we don't know which query block this JoinNode originated
from. Therefore, we preserve the straight_join info when creating a JoinNode and check for
it here. Not sure if that answered your question?


Line 380:         // Always place a singular row src on the build side because it has
> has produces
Done


Line 385:         // There is no backend support for distributed non-equi right outer/semi
joins.
> this seems to contradict the condition, which triggers if !issinglenodeexec
Expanded comment. Renamed param to isLocalPlan.


PS2, Line 388: build-side hash table
> "the materialized rhs", no? (this still applies to nlj?)
Correct. Done.


Line 398:             !(joinOp.isInnerJoin() && !joinNode.getEqJoinConjuncts().isEmpty()))
{
> why can't this be applied to inner equi joins?
It can be applied and that's the intended follow-on change. My thinking is that it would be
easier to review the changes in isolation so we don't have to reason about all the plan changes
at once.


Line 404:     // Re-compute tuple ids since their order must correspond to the order of children.
> why?
Expanded comment. I think removing this BE assumption may be involved.


Line 414:   private PlanNode useNljForSmallBuilds(PlanNode root, Analyzer analyzer)
> the name doesn't really reflect the (much narrower) semantics, find a bette
Done


Line 423:     if (joinNode.getJoinOp().isNullAwareLeftAntiJoin()) return root;
> also check that this is really a hash join
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3846
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If86db7753fc585bb4c69612745ec0103278888a4
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <marcel@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message