impala-reviews mailing list archives

From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4883: Union Codegen
Date Thu, 30 Mar 2017 22:59:40 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4883: Union Codegen
......................................................................


Patch Set 4:

(2 comments)

Hmm, I guess the per-row overhead is probably not as significant for the case when we're returning
a bunch of columns. There might be a more pronounced effect if we're just returning a handful
of columns. 

I think we can still squeeze some more cycles out of this with minimal effort. If we stop
seeing a measurable improvement on a query that returns only a few columns, we could
consider stopping then.

http://gerrit.cloudera.org:8080/#/c/6459/4/be/src/exec/union-node-ir.cc
File be/src/exec/union-node-ir.cc:

Line 28:   while (!dst_batch->AtCapacity() && child_row_idx_ < child_batch_->num_rows()) {
The remaining optimisation is to pull all references to member variables out of the loop,
i.e. child_row_idx_, child_batch_, child_expr_lists_, tuple_desc_->byte_size().
This will reduce the number of loads and stores quite a bit. E.g.

int child_row_idx = child_row_idx_;
int tuple_byte_size = tuple_desc_->byte_size();
while (...) {
  // Use the locals child_row_idx and tuple_byte_size here.
}
child_row_idx_ = child_row_idx;

Currently it will do a load and store to variables like 'child_row_idx_' via the 'this' pointer
on every loop iteration.

The compiler could do that automatically if it could prove that the values aren't modified
via a different pointer, but I don't think that's provable in this case, because the compiler
has to generate code that's still "correct" in a weird case like 'this' and *tuple_buf pointing
to the same memory.
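
To make the hoisting idea concrete, here is a minimal standalone sketch. It is not the UnionNode code: RowCopier and all of its members are hypothetical stand-ins for a node that copies fixed-size rows into an output buffer. The first method touches members through 'this' on every iteration; the second loads them into locals once and writes the row index back once after the loop. The two are equivalent only under the assumption that the destination buffer does not overlap the object itself, which is exactly what the compiler can't assume on its own.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a node that copies fixed-size rows to an output buffer.
struct RowCopier {
  const uint8_t* src_ = nullptr;  // source rows, row_size_ bytes each
  int num_rows_ = 0;              // number of rows available in src_
  int row_size_ = 0;              // bytes per row
  int row_idx_ = 0;               // next source row to copy

  // Baseline: each iteration reads and writes members via 'this'.
  int CopyViaMembers(uint8_t* dst, int dst_capacity_rows) {
    int copied = 0;
    while (copied < dst_capacity_rows && row_idx_ < num_rows_) {
      std::memcpy(dst + copied * row_size_, src_ + row_idx_ * row_size_, row_size_);
      ++row_idx_;
      ++copied;
    }
    return copied;
  }

  // Hoisted: members are loaded into locals once and stored back once.
  // Assumes 'dst' does not alias this object.
  int CopyViaLocals(uint8_t* dst, int dst_capacity_rows) {
    const uint8_t* src = src_;
    const int num_rows = num_rows_;
    const int row_size = row_size_;
    int row_idx = row_idx_;
    int copied = 0;
    while (copied < dst_capacity_rows && row_idx < num_rows) {
      std::memcpy(dst + copied * row_size, src + row_idx * row_size, row_size);
      ++row_idx;
      ++copied;
    }
    row_idx_ = row_idx;  // single store after the loop
    return copied;
  }
};
```

Both methods leave the object in the same state for non-overlapping buffers; whether the hoisted form is measurably faster depends on the compiler and the loop body, so it is worth benchmarking rather than assuming.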


Line 34:     if (ReachedLimit()) break;
We can avoid checking limits for each row if we check it at the end and truncate the batch
using RowBatch::set_num_rows. E.g. see SortNode::GetNext().
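
A rough sketch of the check-at-the-end pattern, under stated assumptions: Batch and LimitedProducer below are hypothetical, simplified types, and Batch::set_num_rows only mimics the role of RowBatch::set_num_rows. This is not how SortNode::GetNext() is written, just an illustration of the same idea: the loop never tests the limit, and any overshoot is trimmed once afterwards.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical minimal batch; set_num_rows stands in for RowBatch::set_num_rows.
struct Batch {
  std::vector<int> rows;
  int capacity = 0;
  bool AtCapacity() const { return static_cast<int>(rows.size()) >= capacity; }
  int num_rows() const { return static_cast<int>(rows.size()); }
  void set_num_rows(int n) { rows.resize(n); }  // truncates the batch
};

// Hypothetical producer that enforces a row limit across calls.
struct LimitedProducer {
  int64_t limit_ = 0;
  int64_t num_rows_returned_ = 0;

  bool ReachedLimit() const { return num_rows_returned_ >= limit_; }

  // Copies rows from 'input' starting at *input_idx into 'dst'.
  // The limit is checked once after the loop instead of once per row.
  void Fill(const std::vector<int>& input, int* input_idx, Batch* dst) {
    const int rows_before = dst->num_rows();
    const int num_input = static_cast<int>(input.size());
    int idx = *input_idx;
    while (!dst->AtCapacity() && idx < num_input) {
      dst->rows.push_back(input[idx]);
      ++idx;
    }
    *input_idx = idx;

    num_rows_returned_ += dst->num_rows() - rows_before;
    if (num_rows_returned_ > limit_) {
      // Overshot the limit: drop the excess rows from the end of the batch.
      const int64_t excess = num_rows_returned_ - limit_;
      dst->set_num_rows(dst->num_rows() - static_cast<int>(excess));
      num_rows_returned_ = limit_;
    }
  }
};
```

The trade-off is that the last batch may materialise a few rows that are then thrown away, in exchange for removing a branch from every iteration.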


-- 
To view, visit http://gerrit.cloudera.org:8080/6459
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4107d27582ff5416172810364a6e76d3d93c439
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes
