impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dan Hecht (Code Review)" <ger...@cloudera.org>
Subject [Impala-CR](cdh5-trunk) IMPALA-3286: Prefetching for PHJ probing.
Date Sat, 14 May 2016 23:58:13 GMT
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3286: Prefetching for PHJ probing.
......................................................................


Patch Set 7:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/2959/7/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

Line 97:           malloc(sizeof(Tuple*) * num_build_tuples))) {
you don't have to do it now, but now that we have an Init() routine, it'd be good to move
things that can fail like this  malloc there so we can return a status.


Line 853: expr_values_bytes_per_row > 0)
why do we have this check? in Init() we're DCHECK_GT zero.  Is that dcheck incorrect or is
this guard unnecessary?


http://gerrit.cloudera.org:8080/#/c/2959/7/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

Line 155: results in
evaluation to the current row of the ExprValuesCache in 'ht_ctx': the results are saved in
'cur_expr_values_', the nullness of ...


Line 186:  
and the corresponding hash table buckets are prefetched before ...
(to explain why pipelining is necessary)


Line 230: Resets the iterators and end pointers to the start before writing
Resets the cache before starting a write phase.
(to be more explicit that this means the old contents are gone, not just iterator resetting).


Line 533: Returns an iterator to the bucket matching the value of the current row evaluated
        :   /// against the probe expressions. It's assumed that EvalAndHashProbe() has been
called
        :   /// and the evaluated values in the expression values cache of 'ht_ctx' will be
used.
how this all works might still be a bit confusing to readers who don't know all the details.
maybe clarify with something like:

Returns an iterator to the bucket that matches the probe expression results that are cached
at the current position of the ExprValuesCache in 'ht_ctx'.  Assumes that the ExprValuesCache
was filled using EvalAndHashProbe().


http://gerrit.cloudera.org:8080/#/c/2959/7/be/src/exec/partitioned-hash-join-node.cc
File be/src/exec/partitioned-hash-join-node.cc:

Line 1034:     ExprContext::FreeLocalAllocations(probe_expr_ctxs_);
seems like we could move this to where we re-fill the expr values cache (which will happen
more often), but it's probably okay here too. mostly wondering if there's a reason why this
is better?


-- 
To view, visit http://gerrit.cloudera.org:8080/2959
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib42b93d99d09c833571e39d20d58c11ef73f3cc0
Gerrit-PatchSet: 7
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Huaisi Xu <hxu@cloudera.com>
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokhtar@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message