impala-dev mailing list archives

From "Michael Ho (Code Review)" <ger...@cloudera.org>
Subject [Impala-CR](cdh5-trunk) IMPALA-3286: Software prefetching for hash table build.
Date Thu, 28 Apr 2016 20:26:18 GMT
Michael Ho has posted comments on this change.

Change subject: IMPALA-3286: Software prefetching for hash table build.
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/2896/1/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

Line 296:   TupleRow* expr_values_row_;
> I personally feel like the original design where the buffers are embedded i
Thanks for the pointer to that patch; it will be very helpful. My preference is to get this
code in for 2.6 soon and do the cleanup separately (brushing up your patch also seems
non-trivial). Your patch may be useful as a follow-up if we want to try saving the build
expression values computed during prefetching.


http://gerrit.cloudera.org:8080/#/c/2896/1/be/src/exec/partitioned-hash-join-node.cc
File be/src/exec/partitioned-hash-join-node.cc:

Line 337:  hash_values_.reset(new uint32_t[state->batch_size()]);
> Even if the input batch is big, could you just process a subset of it at a 
I tried implementing that too, but it requires a nested loop inside the FOREACH_ROW iterator.
The code looks rather complicated, but it's doable. If we can set a reasonable max size for a
row batch when it's created, things may be easier. Is 1024 a reasonable size? I know it may not
make sense for rows with, say, a single tinyint, but is it a reasonable number for the common
case?
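
To make the subset idea concrete, here is a rough, self-contained sketch of the nested-loop
shape. It is not the code in this patch: ToyHashTable, HashKey, BuildInChunks and kMaxChunk
are hypothetical names, the table is a toy linear-probing one, and keys are assumed to be
non-negative integers. The point is only that a fixed-size hash buffer works if the batch is
walked in chunks of at most 1024 rows.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical constants for this sketch only.
constexpr int64_t kEmpty = -1;      // marks an unused bucket; keys must be >= 0
constexpr size_t kMaxChunk = 1024;  // rows hashed and prefetched per inner pass

// Toy open-addressing table with linear probing (not Impala's HashTable).
struct ToyHashTable {
  std::vector<int64_t> buckets;
  size_t size = 0;

  explicit ToyHashTable(size_t num_buckets) : buckets(num_buckets, kEmpty) {
    // A power-of-two bucket count lets us mask instead of taking a modulus.
    assert(num_buckets > 0 && (num_buckets & (num_buckets - 1)) == 0);
  }

  size_t BucketIndex(uint32_t hash) const { return hash & (buckets.size() - 1); }

  // Hint that the bucket for 'hash' will be written soon (GCC/Clang builtin).
  void Prefetch(uint32_t hash) const {
#if defined(__GNUC__) || defined(__clang__)
    __builtin_prefetch(&buckets[BucketIndex(hash)], 1, 3);
#else
    (void)hash;  // no-op on other compilers
#endif
  }

  // Returns false when the table is full (one slot is always kept empty).
  bool Insert(uint32_t hash, int64_t key) {
    if (size + 1 >= buckets.size()) return false;
    size_t idx = BucketIndex(hash);
    while (buckets[idx] != kEmpty) idx = (idx + 1) & (buckets.size() - 1);
    buckets[idx] = key;
    ++size;
    return true;
  }

  bool Contains(uint32_t hash, int64_t key) const {
    size_t idx = BucketIndex(hash);
    while (buckets[idx] != kEmpty) {
      if (buckets[idx] == key) return true;
      idx = (idx + 1) & (buckets.size() - 1);
    }
    return false;
  }
};

// Simple multiplicative hash, used here only for illustration.
inline uint32_t HashKey(int64_t key) {
  const uint64_t x = static_cast<uint64_t>(key) * 0x9E3779B97F4A7C15ULL;
  return static_cast<uint32_t>(x >> 32);
}

// Walks 'keys' in chunks of at most kMaxChunk rows. Pass 1 hashes each row and
// prefetches its bucket; pass 2 inserts using the saved hash values.
// Returns the number of rows inserted.
size_t BuildInChunks(const std::vector<int64_t>& keys, ToyHashTable* ht) {
  uint32_t hash_values[kMaxChunk];  // fixed size, independent of batch size
  size_t inserted = 0;
  for (size_t start = 0; start < keys.size(); start += kMaxChunk) {
    const size_t n = std::min(kMaxChunk, keys.size() - start);
    for (size_t i = 0; i < n; ++i) {
      hash_values[i] = HashKey(keys[start + i]);
      ht->Prefetch(hash_values[i]);
    }
    for (size_t i = 0; i < n; ++i) {
      if (!ht->Insert(hash_values[i], keys[start + i])) return inserted;
      ++inserted;
    }
  }
  return inserted;
}
```

With this shape the hash buffer no longer has to be sized from state->batch_size(), at the
cost of the extra loop level mentioned above.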


-- 
To view, visit http://gerrit.cloudera.org:8080/2896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib85e7fc162ad25c849b9e716b629e226697cd940
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes
