impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dimitris Tsirogiannis (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5547: Rework FK/PK join detection.
Date Wed, 28 Jun 2017 22:41:22 GMT
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-5547: Rework FK/PK join detection.
......................................................................


Patch Set 1:

(17 comments)

I have one comment about testing. We have lots of test changes but it's not easy to verify
if the end result makes sense or not. I think it may worth having a planner test with explain_level
> 2 that covers a few interesting join cases so that we can see what kind of pk/fk conjuncts
are detected and what the impact is on estimated join cardinalities.

http://gerrit.cloudera.org:8080/#/c/7257/1/fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
File fe/src/main/java/org/apache/impala/planner/HashJoinNode.java:

PS1, Line 170: else if (fkPkEqJoinConjuncts_.isEmpty()) {
             :           output.append("assumed fk/pk");
I haven't read the entire patch yet, so I am curious what this case represents.


PS1, Line 178: for (int j = 0; j < slots.size(); ++j) {
             :               output.append(slots.get(j).toString());
             :               if (j + 1 != slots.size()) output.append(", ");
             :             }
Can't you use the Guava's Joiner class here? Joiner.on(",").join(slots)


http://gerrit.cloudera.org:8080/#/c/7257/1/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

PS1, Line 37: import org.apache.kudu.client.shaded.com.google.common.collect.Maps;
?


PS1, Line 271: scanSlotsByTids
nit: Each pair represents two joined tables, no? So, maybe rename this to 'scanSlotsByJoinedTids'?
A bit more explicit on what this thing represents.


PS1, Line 281: numRows
rhsTableCardinality?


Line 290: 
nit: extra line


PS1, Line 294: conjuncts
conjunct slots


PS1, Line 300:     Preconditions.checkState(joinOp_.isInnerJoin() || joinOp_.isOuterJoin());
This private function is only called in L253? I think you can remove this check.


PS1, Line 312: lhsNdv
Can this ever be 0?


PS1, Line 313: rhsNumRows
Same here, can this be 0?


PS1, Line 318: result = Math.min(result, joinCard);
That part I think is very important and I am not sure it is explained anywhere. Maybe expand
the comment in L209?


PS1, Line 329: conjuncts
conjunct slots


PS1, Line 341: slots.lhs().getStats().getNumDistinctValues();
nit: these are used in couple of places. Since you already have the EqJoinConjunctScanSlots
class, why don't your wrap them in some nice util functions there (e.g. getLhsNdvs() or something
like that).


PS1, Line 343: slots.lhs().getParent().getTable().getNumRows();
same comment as above.


PS1, Line 373: Preconditions
I don't think these checks are useful. This private constructor is only called through the
static function that has tight control on what gets passed in this ctor.


PS1, Line 382: .
nit: remove


PS1, Line 399: tupleDesc
nit: inline


-- 
To view, visit http://gerrit.cloudera.org:8080/7257
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I49074fe743a28573cff541ef7dbd0edd88892067
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokhtar@cloudera.com>
Gerrit-Reviewer: Zach Amsden <zamsden@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message