impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dan Hecht (Code Review)" <>
Subject [Impala-CR](cdh5-trunk) IMPALA-3344: Simplify sorter and document/enforce invariants.
Date Wed, 25 May 2016 22:32:50 GMT
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3344: Simplify sorter and document/enforce invariants.

Patch Set 14:


Nice cleanup.
File be/src/runtime/

Line 55: It may not succeed in advancing to the next row if
       :   /// the next batch is empty and has attached resources
Is that true? it looks like the loop below will continue to the next batch in this case (attaching
resources), so Advance() will always succeed at setting up the next row.

Line 86: one row including current_row() remaining in the
       :   /// batch. 
i'm not sure what this is saying.  also, looks like it's only called in Advance() so how about
just inlining it there (or at least make it private)?
File be/src/runtime/sorted-run-merger.h:

Line 68:   void TransferAllResources(RowBatch* transfer_resource_batch);
i don't see any callers or implementation of this.  looks like it should be removed.
File be/src/runtime/

Line 71: If the run is unsorted, it is sorted in-place in
       : /// memory. Once sorted, runs can be spilled to disk if the sorter does not have
       : /// memory.
I think this could be clearer if the cause-effect relationship is clearer. Something like:

Initial runs continue to grow as long as more blocks can be allocated. If a new block cannot
be allocated, then the run is sorted in-place and spilled to disk.  Then a new initial run
is started.

Line 83:  
(for merging if there are multiple runs) ?

Line 83:  
(if there was a single run) ?

Line 89:   ~Run() { DeleteAllBlocks(); }
it would be best to avoid doing buffer management in the destructor. can you move this some
place explicit and instead make the destructor dcheck that the run has no blocks?

Line 94: (
why parenthesis?

Line 102: start_index

Line 103:     return AddBatch<true>(batch, start_index, num_processed);
It will be DCHECKed inside, but adding this dcheck might clarify tot he reader that this can
only be called on initial runs:


Line 105:   Status AddIntermediateBatch(RowBatch* batch, int start_index, int* num_processed)
likewise, for documentation purpoes:

Line 109: AddBatch

Line 131: .
is this correct: (otherwise, var-length slots already contain pointers rather than offsets)

Line 132: when
after? (i.e. it won't be set on the last batch that returns the final rows).

Line 132: read

Line 133: .
and has no attached blocks.

Line 134: 2
two (to match "one")

Line 136: 1 (2
one (two

Line 137: unpinned (Sorter holds on to the memory).
is this true? don't we transfer the blocks as they are completed?

Line 140:  If we leave the last run to be merged in memory, the fixed-len blocks can be
        :   /// unpinned as they are consumed.
aren't the blocks transferred in this case? so what's left to do?

Line 145: deleted
is that accurate? or should it say "transferred"?

Line 161: AddBatch
this naming (with AddBatchInternal) is a little confusing given that AddBatch() itself is
internal (i.e. private). Maybe some other way to distinguish/factor?

Line 231:   const int last_tuple_block_offset_;
do we move between blocks often enough to make precalculating this worthwhile? (probably no
given that we do a multiply by capicity_ already in that function).

Line 247:   /// True if the tuples in the run are currently in sorted order.
Always true for intermediate runs.

Line 254: deleted
transferred or deleted?

Line 263: deleted
transferred or deleted?

Line 268: allocated

Line 268: FinalizeInput
why do it during FinalizeInput() rather than just always do it at UnpinAllBlocks() time?

Line 269: UnpinAllBlocks
UnpinBlocks() doesn't literally delete that block, so maybe clarify this a bit.

Line 312: begin

Line 315: end

Line 319: .
also updates 'index_'?

Line 348: invalid index
can this be more specific (does e.g. Prev(Next()) always get you back to where you were or
is that not the case if Next() got you to an illegal index)? like is it -1 or #tuples in those

Line 492:   else sorter_->spilled_runs_counter_->Add(1);
do we use one-line syntax if there is an else clause?

Line 622: 
nit: extraneous blank lines

Line 633: Intermediate merge runs are unpinned.
not sure what this is trying to tell me. that intermediate runs are always !is_pinned_, or
is it just saying what the next line of code is doing?

In any case, I think more explanation for the DCHECK at line 632 would be good, which could
summarize both points I think:

Initial runs will always be pinned at this point (they still need to be sorted, and might
be unpinned after sorting). Intermediate runs always have is_unpinned_.

So actually, couldn't the dcheck be stronger. move out of if-stmt and:
DCHECK_EQ(is_initial_, is_pinned_);

Line 651:   DCHECK(is_sorted_);
DCHECK(is_initial_) also?

Line 729:   // and the individual blocks do not need to be pinned.
Is there a more direct way of explaining/checking this?  Is this the case where there was
only one run?  maybe from the run the is_pinned_ check is the only way to differentiate this
case, but at least let's improve the comment.

Line 775:     // if it did, we would have to return them to the caller.
This comment is confusing. Is it saying that the assumption might not be valid (why else does
it tell us what we have to do if the resources are attached)?  Rather than what we assume,
let's just say what the invariant is:

Setting output_batch to NULL signals eos to the caller, so GetNext() is not allowed to attach
resources to the batch on eos.

Line 781: 
nit: many of these blank lines don't help readability and cause less code to fit on the screen.

Line 812:   } else {
maybe get rid of this else since L813 is really the start of the case at 816.

Line 840:       // The var-len data is in the next block, so end this call to GetNext().
would be good to explain "why" rather than "what". is it so the var length block can be attached/deleted?

Line 853: 
nit: remove blank line

Line 874:         output_batch->AddBlock(var_len_blocks_[var_len_blocks_index_]);
couldn't we have attached this var len block above, and so now we're passing NULL to AddBlock()?
Is that okay/intentional?  also, why wouldn't the code above (in the convert offsets path)
have always taken care of this on the last block?

Line 883: 
nit: remove blank line

Line 904:   RETURN_IF_ERROR(next_block->Pin(&pinned, curr_block, false));
if there is an error here, is it okay that curr_block is still in *blocks?  Pin() always deletes
release_block, right?

Line 1047: tuple
block containing the last tuple?

Line 1050:     index_ = run->num_tuples();
isn't this done already?

Line 1070: Can't move to block past the end.
I was confused by this comment until reading the code because it's referring to a non-existing
block.  Maybe say:

// When moving beyond the last tuple, stay at the last block.

or similar

Line 1088:   // Can't move to block before the first.
similarly: When moving before the first tuple, stay at the first block.

Line 1341:   block_mgr_->ClearReservations(block_mgr_client_);
it'd be best to not do all this buffer management work in the destructor, but instead have
it explicitly do this at the right point in time.  and then make the destructor just dcheck
that the cleanup already happened.

Line 1408:   if (block_mgr_->num_free_buffers() < min_buffers_for_merge - blocks_per_run)
I don't understand why we shouldn't just always unpin this last run, given that there's more
than one run meaning the sum of all runs can't possibly fit in memory (at least given the
memory we had while consuming input). The original code wouldn't have thought that there would
now be more buffers available (because the BBM was per-node), but maybe that's what we're
hoping for here?

Also not doing UnpinAllBlocks() always makes it harder to reason about the var_len_copy_block_

Line 1441:     // in the pinned blocks in the single sorted run.
Is this comment still relevant? What is it trying to tell me?

Line 1444:       DCHECK_EQ(output_batch->num_rows(), 0);
what is this DCHECK documenting? why is this important on this path? (also if keeping it,
make it one line).

Line 1448:     // resource.
is this comment still relevant given the cleanup to memory management? isn't that up to the
merger_? What does it have to do with Sorter resources?
File be/src/runtime/sorter.h:

Line 60: After the input is consumed, the sorter is left with one or more sorted runs. The
       : /// client calls GetNext(output_batch) to retrieve batches of sorted rows. If there
       : /// multiple runs, the runs are merged using SortedRunMerger to produce a stream
of sorted
       : /// tuples. At least one block per run (two if there are var-length slots) must be
       : /// in memory during a merge, so multiple merges may be necessary if the number of
runs is
       : /// too large.
not your change but this paragraph is kind of confusing because it talks about GetNext() but
then talks about stuff that might have happened earlier (intermediate merges) but isn't very
clear about that.

Line 77:  final merge)
again confusing because the comment didn't really explain how merging happens (intermediate
vs final). would be good to clarify the algorithm.

Line 137: Finalizes
not sure what it means to "finalize" unsorted_run_ from just reading this header. is there
a concise way to explain that (at least the part that relevant to the caller)?

Line 199: unsorted
they become sorted, right? so maybe just say "initial runs"?

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I9c619e81fd1b8ac50e257172c8bce101a112b52a
Gerrit-PatchSet: 14
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <>
Gerrit-Reviewer: Dan Hecht <>
Gerrit-Reviewer: Matthew Jacobs <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-HasComments: Yes

View raw message