impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Thomas Tauber-Marshall (Code Review)" <ger...@cloudera.org>
Subject [Impala-CR](cdh5-trunk) IMPALA-3210: last/first value() support for IGNORE NULLS
Date Wed, 06 Jul 2016 17:15:57 GMT
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-3210: last/first_value() support for IGNORE NULLS
......................................................................


Patch Set 5:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/3328/5/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

PS5, Line 1371: // TODO: Rename to something like 'SetVal' so its more natural to use with
other UDAs.
> would you mind doing this?
Done


PS5, Line 1454: src
> Call with T::null() to be explicit, maybe use the same comment you have bel
Done


PS5, Line 1470: We c
> remove We
Done


PS5, Line 1471:  for us
> remove for us
Done


PS5, Line 1515: state->last_val.ptr
> only do this if last_val is not null
Done


PS5, Line 1547:  we encounter.
> encountered.
Done


PS5, Line 1555: we encounter
> encountered
Done


http://gerrit.cloudera.org:8080/#/c/3328/5/be/src/exprs/aggregate-functions.h
File be/src/exprs/aggregate-functions.h:

PS5, Line 251: Requires a start bound of UNBOUNDED PRECEDING.
> I think you can leave this out here, it's not particularly actionable to pe
Unless I'm missing something, the functions don't have access to the window, so we couldn't
put a DCHECK in Update(), unless we add the window to the FunctionContext. Or we could put
a the DCHECK in AnalyticEvalNode::Init. 
At the very least, I think it should be documented somewhere, even if its not right here.


http://gerrit.cloudera.org:8080/#/c/3328/5/fe/src/main/java/com/cloudera/impala/analysis/AnalyticExpr.java
File fe/src/main/java/com/cloudera/impala/analysis/AnalyticExpr.java:

PS5, Line 553: function
> ... in an equivalent execution. Only applies to analytic functions with a w
Done


PS5, Line 555: reverse
> Maybe rename to reverseWindowFn to better indicate this only applies to fun
Done


Line 570:       fnCall_.analyzeNoThrow(analyzer);
> // Use getType() instead if getReturnType() because wildcard decimals
Done


PS5, Line 589: '_ignore_nulls"
> inconsistent quote/double quotes
Done


PS5, Line 590: so that the BE doesn't have to know about
             :    *    IGNORE NULLS
> because the BE implements them as different functions.
Done


PS5, Line 661:       // Use getType() instead of getReturnType() because wildcard decimals
             :       // have only been resolved in the former.
             :       // TODO: Is this necessary?
             :       Preconditions.checkState(type_.equals(fnCall_.getType()));
> Can you put this in reverse() after we analyze the new fnCall_ ? I think it
Done


http://gerrit.cloudera.org:8080/#/c/3328/5/fe/src/main/java/com/cloudera/impala/analysis/FunctionCallExpr.java
File fe/src/main/java/com/cloudera/impala/analysis/FunctionCallExpr.java:

PS5, Line 468: IGNORE NULLS may only be used with analytic functions.
> Can you just say this function doesn't support IGNORE NULLS rather than say
Done


http://gerrit.cloudera.org:8080/#/c/3328/5/fe/src/main/java/com/cloudera/impala/catalog/BuiltinsDb.java
File fe/src/main/java/com/cloudera/impala/catalog/BuiltinsDb.java:

PS5, Line 967: createAnalyticBuiltin
> can you call the overload which has the bool parameter at the end and set i
Done


http://gerrit.cloudera.org:8080/#/c/3328/5/fe/src/test/java/com/cloudera/impala/planner/PlannerTest.java
File fe/src/test/java/com/cloudera/impala/planner/PlannerTest.java:

Line 57:     runPlannerTestFile("analytic-fns-ignore-nulls");
> Any reason these are separate files/tests? If not, can you merge them to re
Done


http://gerrit.cloudera.org:8080/#/c/3328/5/testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
File testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test:

PS5, Line 743: ASC NULLS LAST
> I don't think this is necessarily an issue, but it's weird to me that now t
The reason that this happens is that when OrderByElement.reverse creates new OrderByElements
it always assigns either true or false to 'nullsFirst', even if the original OrderByElement's
'nullsFirst' was NULL.

As you say, though, "NULLS LAST" is the default for ASC, and this plan is the same from the
perspective of the backend as an equivalent plan where "NULLS LAST" wasn't explicit, so I
don't think it should matter.


PS5, Line 1545: 07:ANALYTIC
              : |  functions: first_value(bigint_col)
              : |  partition by: tinyint_col
              : |  order by: id ASC
              : |  window: RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
              : |
              : 06:ANALYTIC
              : |  functions: first_value(bigint_col)
              : |  partition by: tinyint_col
              : |  order by: id ASC
              : |  window: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
              : |
              : 05:ANALYTIC
              : |  functions: first_value(int_col)
              : |  partition by: tinyint_col
              : |  order by: id ASC, bool_col DESC
> One of the simplifications we made does result in less grouping than we had
Its true that we get less grouping in this particular situation, and that in general this
patch will result in more types of windows being used since its less aggressive about standardizing
things (assuming that all possible input windows are used evenly, which is probably not true).

But, its not entirely clear that the changes that were made will result in less grouping overall,
since first_value_rewrite couldn't be grouped with anything else (as enforced by AnalyticPlanner.requiresIndependentEval,
which this patch removes).

It really depends on what combinations of bounds are more or less common.


-- 
To view, visit http://gerrit.cloudera.org:8080/3328
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic27525e2237fb54318549d2674f1610884208e9b
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message