impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dan Hecht (Code Review)" <>
Subject [Impala-CR](cdh5-trunk) IMPALA-3286: Prefetching for PHJ probing.
Date Sat, 14 May 2016 04:17:18 GMT
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3286: Prefetching for PHJ probing.

Patch Set 6:

File be/src/exec/hash-table.h:

Line 316: values
value (since "each")

Line 346: value's
values (plural not possessive) and could delete "buffer"
File be/src/exec/

Line 373:   bool has_probe_rows = current_probe_row_ != NULL || !probe_batch_iterator.AtEnd();
I find the positive conditionals easier to reason about, but if you or Tim find the current
code easier, than okay to leave:

// If there's no left over from the previous call and we've iterated the entire row batch,
then we're done with this batch.
bool batch_done = current_probe_row == NULL && probe_batch_iterator.AtEnd();

while (!batch_done ...) {
  batch_done = current_probe_row == NULL;

Line 383: Prefetching if enabled
Prefetching, if enabled,

Line 387: In this case, prefetching may not be effective as there aren't
        :     // enough cycles for prefetched items to trickle in.
this sentence is still confusing. I think what you mean is that we shouldn't bother to prefetch
in this case, but it makes it sound like we do prefetch but it might not be effective. I think
just delete this last sentence.

Line 410:     has_probe_rows = current_probe_row_ != NULL;
maybe add:
if (!has_probe_rows) DCHECK(probe_batch_iterator.AtEnd());
that implication has to hold, correct?

Line 413:         probe_batch_->num_tuples_per_row();
just to check my understanding, if we're AtEnd(), then probe_batch_pos_ = batch->num_rows_,
since Get() will be one past the last? is that right?

Line 475: cur_row
how did the old code work?

Line 494:         if (UNLIKELY(!hash_tbl_->Insert(ht_ctx, row_idx, row))) {
can combine these two if-stmts
File be/src/exec/

Line 677:     // 'probe_exp_ctxs_' should have made no local allocations in this function.
is there a simple DCHECK we could write to verify that?

Line 1032:     DCHECK(probe_batch_pos_ == -1 || ht_ctx_->expr_values_cache()->AtEnd());
why can't this just be:
DCHECK(ht_ctx_->expr_values_cache()->AEnd()); ?
which is what the comment says.

Line 1754:   DCHECK(process_probe_batch_fn->getLinkage() == GlobalValue::WeakODRLinkage)
out of curiosity why does this end up weak now but not before?
File be/src/exec/partitioned-hash-join-node.h:

Line 252: probing

Line 254:  

Line 255: free space
doesn't this always reset the cache? "free space" seems to imply it won't but instead just
uses what's remaining.  maybe say "depends on the capacity of ..."

Line 579:  
Document the return value.

Line 579: constant. 
a constant
File be/src/runtime/row-batch.h:

Line 409: . Will iterate up till the number
        : /// of rows in the given row batch if  '_start_row_idx + _limit' is beyond the last
I was just being dumb when I didn't notice how limit works. I think you should just delete
this sentence like you had it before.

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib42b93d99d09c833571e39d20d58c11ef73f3cc0
Gerrit-PatchSet: 6
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Michael Ho <>
Gerrit-Reviewer: Dan Hecht <>
Gerrit-Reviewer: Huaisi Xu <>
Gerrit-Reviewer: Michael Ho <>
Gerrit-Reviewer: Mostafa Mokhtar <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-HasComments: Yes

View raw message