impala-reviews mailing list archives

From "Bharath Vissapragada (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5612: join inversion should factor in parallelism
Date Fri, 07 Jul 2017 23:32:48 GMT
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-5612: join inversion should factor in parallelism
......................................................................


Patch Set 2:

(5 comments)

I have some minor comments; otherwise the patch looks OK to me.

http://gerrit.cloudera.org:8080/#/c/7351/2/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

Line 386:    *    cardinality*avgSerializedSize. Do not invert if relevant stats are missing.
Update comment to add the 4th case?


PS2, Line 424: invertedJoinIsCheaper
nit: isInvertedJoinCheaper()?


PS2, Line 459: (log_b(rhsBytes) + C) * (lhsCard + 2 * rhsCard)
What is the unit of this? In other words, what exactly are we trying to optimize per node?
Based on the last point above, I thought it would look something like:

((log_b(rhsBytes) * lhsCard) + 2 * rhsCard) * C

My understanding was that each probe row looks up the hash table (~ log_b(rhsBytes) * lhsCard
in total), that building the hash table costs 2 * rhsCard, and that C is the fixed cost. Am I
missing something?
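
To make the difference between the two formulas above concrete, here is a minimal
sketch that evaluates both for the same inputs. The class and method names are
hypothetical and are not taken from the patch; C = 5 mirrors CONSTANT_COST_PER_ROW
from line 488, and base 10 mirrors the log10 call on line 491.

```java
class JoinCostSketch {
  // Assumption: per-row constant, mirroring CONSTANT_COST_PER_ROW = 5 in the patch.
  static final double C = 5.0;

  // Logarithm of x in an arbitrary base, via the change-of-base identity.
  static double logB(double x, double base) {
    return Math.log(x) / Math.log(base);
  }

  // Formula quoted from the patch comment:
  //   (log_b(rhsBytes) + C) * (lhsCard + 2 * rhsCard)
  static double patchCost(double lhsCard, double rhsCard, double rhsBytes, double base) {
    return (logB(rhsBytes, base) + C) * (lhsCard + 2 * rhsCard);
  }

  // Alternative suggested in this review:
  //   ((log_b(rhsBytes) * lhsCard) + 2 * rhsCard) * C
  static double reviewCost(double lhsCard, double rhsCard, double rhsBytes, double base) {
    return (logB(rhsBytes, base) * lhsCard + 2 * rhsCard) * C;
  }

  public static void main(String[] args) {
    // Illustrative inputs only: 100 probe rows, 10 build rows, 1000 build bytes.
    double lhsCard = 100, rhsCard = 10, rhsBytes = 1000, base = 10;
    System.out.println("patch formula:  " + patchCost(lhsCard, rhsCard, rhsBytes, base));
    System.out.println("review formula: " + reviewCost(lhsCard, rhsCard, rhsBytes, base));
  }
}
```

For these inputs the patch formula gives about 960 and the alternative about 1600.
In the first, the log term and C are both charged per row on every probe and build
row; in the second, the log term applies only to probe rows and C scales the total.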


Line 488:     final long CONSTANT_COST_PER_ROW = 5;
How was this chosen?


PS2, Line 491: log10
Shouldn't this be base 2? I don't think it matters as long as we use the same base in both
cases, but just wondering.
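
As a quick check on the base question, the sketch below compares base 2 and base 10.
The names are hypothetical, and C = 5 is assumed from CONSTANT_COST_PER_ROW. Changing
the base multiplies the log term by a fixed factor (log2(10), roughly 3.32), but
because the formula on line 459 adds C to the log, the per-row term
(log_b(rhsBytes) + C) does not scale by one fixed factor across input sizes.

```java
class LogBaseSketch {
  // Assumption: per-row constant, mirroring CONSTANT_COST_PER_ROW = 5 in the patch.
  static final double C = 5.0;

  static double log2(double x) {
    return Math.log(x) / Math.log(2.0);
  }

  // Ratio of the bare log terms; the same for every x > 1.
  static double logRatio(double x) {
    return log2(x) / Math.log10(x);
  }

  // Ratio of the per-row terms (log + C); this one depends on x.
  static double perRowRatio(double x) {
    return (log2(x) + C) / (Math.log10(x) + C);
  }

  public static void main(String[] args) {
    // Illustrative build-side sizes in bytes.
    for (double bytes : new double[] {1e3, 1e6, 1e9}) {
      System.out.println(bytes + " bytes: log ratio = " + logRatio(bytes)
          + ", per-row ratio = " + perRowRatio(bytes));
    }
  }
}
```

So the choice of base effectively changes the weight of the log term relative to C;
whether that is enough to change an inversion decision depends on the inputs.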


-- 
To view, visit http://gerrit.cloudera.org:8080/7351
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icacea4565ce25ef15aaab014684c9440dd501d4e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bharathv@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokhtar@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes
