impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-4883: Union Codegen
Date Fri, 31 Mar 2017 16:23:33 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4883: Union Codegen

Patch Set 5:


Getting pretty close, just minor cleanup at this point.

I also just wanted to check with Michael that the tuple_pool_ approach made the most sense
for now - we'll need to clean that up as part of his codegen work but I don't think it makes
sense to fix in this patch.
File be/src/exec/

Line 34: 
> We can avoid checking limits for each row if we check it at the end and tru
It looks like we already do this for the passthrough case so we might as well do it here.
File be/src/exec/

Line 19: #include "runtime/tuple.h"
Do we need tuple.h? I don't think I see any references to Tuple* in here.

Line 21: #include "util/runtime-profile-counters.h"
Is this needed still?

Line 35:   while (!dst_batch->AtCapacity() && child_row_idx < child_batch->num_rows())
Nice! We can maybe avoid a few more loads and stores via the child_batch and tuple_buf pointers.

int child_batch_rows = child_batch->num_rows().
uint8_t* curr_tuple = *tuple_buf;
*tuple_buf = curr_tuple.

Line 46:   if (limit_ != -1 && num_rows_returned_ + dst_batch->num_rows() >
limit_) {
We don't need to cross-compile this logic. Let's move it into the caller and save LLVM some

Although, see my comment about moving this logic to GetNext() and sharing it for all three
File be/src/exec/

Line 168:   if (limit_ != -1 && num_rows_returned_ + row_batch->num_rows() >
limit_) {
How about we move this logic around num_rows_returned_ and limit_ into GetNext()? I believe
the same logic can work for all three cases if we slightly extend it so that it handles the
case when GetNext() is called with a non-empty batch, which can happen in a subplan.
File be/src/exec/union-node.h:

PS1, Line 71:  in this poo
> Ok, this can be removed in the future.
Michael, do you think it makes sense to use this approach for now? I'm not that familiar with
CodegenMaterializeExprs() so not sure if there is a better way to do this.
File be/src/exec/union-node.h:

Line 28: #include "runtime/tuple.h"
Do we need the tuple.h and tuple-row.h imports? Oh I guess for the inline MaterializeExprs
function, but we can move that to the file anyway.

Line 72:   /// each GetNext() call.
We should add a TODO to remove this. Maybe Michael knows if there's a JIRA that will allow
us to remove it.

PS5, Line 99: Null
NULL here and below, just for consistency with other comments.

PS5, Line 128: row_batch

Line 136:   void IR_ALWAYS_INLINE MaterializeExprs(const std::vector<ExprContext*>&
Move this to the file? I don't think there's a reason we need to define it in the .h

Line 148:   bool inline IsChildPassthrough(int child_idx) const {
I don't think any of the "inline" specifiers here and below do anything - if the function
is defined in the class body it implicitly has an "inline" hint.

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4107d27582ff5416172810364a6e76d3d93c439
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky <>
Gerrit-Reviewer: Michael Ho <>
Gerrit-Reviewer: Taras Bobrovytsky <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-HasComments: Yes

View raw message