impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alex Behm (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4792: Fix number of distinct values for a CASE with constant outputs
Date Fri, 27 Jan 2017 06:01:16 GMT
Alex Behm has posted comments on this change.

Change subject: IMPALA-4792: Fix number of distinct values for a CASE with constant outputs
......................................................................


Patch Set 2:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/5768/2/fe/src/main/java/org/apache/impala/analysis/CaseExpr.java
File fe/src/main/java/org/apache/impala/analysis/CaseExpr.java:

Line 384:     // Otherwise, take a max over all the outputs as well as all the slotrefs that
Why would we consider the slotrefs of the input? They don't affect the NDV of the output.


Line 388:     boolean allOutputsKnown = true;
Why not put this new code into an override of computeNumDistinctValues()?


Line 390:     long    constOutputVals = 0;
formatting: long constOutputVals (just one space after type)


Line 392:     long    maxOutputVals = -1;
maxNonConstNdv? add comment describing what this is


Line 400:       // look at the when is to get an idea of the number of input rows, so
I'm a little confused by this. The input of the when clauses won't tell us anything about
the number of input rows. Do you mwan input NDV? Why would we even consider the when exprs
for computing the NDV? Are you trying to determine how likely a when expr is going to be true?


Line 415:           if (!constLiteralSet.contains(thenLiteral)) {
you can add() and check the return value to see whether the item was already contained


Line 424: 
remove bank line


Line 425:         if (thenNumDistinct == -1) {
single line


Line 432:     if (hasElseExpr_) {
good bit of common code between the handing of this else expr and the then exprs above, please
factor out common code (e.g. with a helper function, or by first assembling a list of all
relevant exprs)


Line 460:         // All must be constant, because if we hit any slot, this would be set
slot -> SlotRef


Line 466:       numDistinctValues_ = Math.max(maxInputVals, maxOutputVals);
Still not following this part. Why are the maxInputVals relevant here?


http://gerrit.cloudera.org:8080/#/c/5768/2/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

Line 304:       // get the max number of distinct values over all the children of this node
sufficient to write:
"... over all children"


Line 308:         // only the SlotRefs, except that it allows a Expr to override the values
nit: an Expr


-- 
To view, visit http://gerrit.cloudera.org:8080/5768
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I21dbdaad8452b7e58c477612b47847dccd9d98d2
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell <joemcdonnell@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonnell@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message