impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
Date Wed, 26 Oct 2016 01:18:39 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

Patch Set 9:


Rebased to pick up IMPALA-3884
File be/src/codegen/

PS7, Line 459: 
> that
File be/src/codegen/codegen-anyval.h:

Line 184:   /// lowered type. This *Val should be non-null. The output variable is called
> nit: space
File be/src/codegen/llvm-codegen.h:

PS7, Line 347: 
             :   /// TODO: Return Status inst
> Avoid cloning if possible to reduce compilation time.
File be/src/exec/

PS7, Line 1660: ond, the
              :     // value of the slot was initialized in the right way in InitAggSlots()
(e.g. 0 for
              :     // SUM) that we get the right result if UpdateSlot() pretends that the
NULL bit of
              :     // 'dst' is unse
> Please also see comments above.
I tried to clean this up a bit. My intention was to document the details of the initialisation
in InitAggSlot() and provide a pointer here.

PS7, Line 1675: dst.SetIsNull(slot_desc->CodegenIsNull(codegen, &builder, agg_tuple_arg));
> Why is this not in the old code or was it done somewhere else ?
It wasn't in the old code - the old code didn't handle arbitrary UDAs, just a subset of the
builtins. E.g. the old code would have given wrong results if we changed SUM() so that it
returned NULL on overflow.
File be/src/exec/partitioned-aggregation-node.h:

PS7, Line 649: 'dst_val' should contain the previous value of the aggregate
             :   /// function, and 'updated_dst_val' is set to the new value after the Update
or Merge
             :   /// operation is applied.
> Just curious why we separate 'dst_val' and 'result_val'. Doesn't 'result_va
We need two different CodegenAnyVal, since it really represents the value (which changes)
rather than the memory location. We could use an in-out argument instead of separate input
and output arguments but this seemed simpler to me.

I renamed 'result_val' to 'updated_dst_val' to hopefully make the connection clearer (the
naming was confusing).
File be/src/exprs/

PS7, Line 496:  false
> Please feel free to ignore but you can consider setting this second argumen
It seems clearer to make it explicit, given it's not obvious what the "default" behaviour
should be.
File tests/query_test/

Line 141:       # Verify codegen was enabled for all stages of the aggregation.
> Can remove this restriction now that IMPALA-3884 is in master

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Michael Ho <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-HasComments: Yes

View raw message