From: "Tim Armstrong (Code Review)"
To: impala-cr@cloudera.com, dev@impala.incubator.apache.org
CC: Dan Hecht, Matthew Jacobs
Reply-To: tarmstrong@cloudera.com
Date: Fri, 27 May 2016 21:44:03 +0000
Subject: [Impala-CR](cdh5-trunk) IMPALA-3344: Simplify sorter and document/enforce invariants.
Message-Id: <201605272144.u4RLi3UA011211@ip-10-146-233-104.ec2.internal>
X-Gerrit-MessageType: comment
X-Gerrit-Change-Id: I9c619e81fd1b8ac50e257172c8bce101a112b52a
X-Gerrit-Commit: 3803e4c7e6d7da9561b4869424deb1bc744f1a30

Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3344: Simplify sorter and document/enforce invariants.
......................................................................

Patch Set 14:

(64 comments)

http://gerrit.cloudera.org:8080/#/c/2826/14/be/src/runtime/sorted-run-merger.cc
File be/src/runtime/sorted-run-merger.cc:

Line 55: It may not succeed in advancing to the next row if
: /// the next batch is empty and has attached resources
> Is that true? it looks like the loop below will continue to the next batch
Old comment. Deleted.
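For context on the sorted-run-merger comments in this thread: SortedRunMerger performs a k-way merge over sorted runs using a min-heap of per-run cursors. Below is a minimal, self-contained sketch of that pattern only, with hypothetical types in place of RowBatch and the real run iterators; it is not the actual Impala implementation.

```cpp
#include <functional>
#include <queue>
#include <tuple>
#include <vector>

// Sketch of a k-way merge over already-sorted runs (hypothetical types).
std::vector<int> MergeSortedRuns(const std::vector<std::vector<int>>& runs) {
  // Heap entry: (value, run index, position within that run).
  using Entry = std::tuple<int, size_t, size_t>;
  std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> min_heap;
  for (size_t i = 0; i < runs.size(); ++i) {
    if (!runs[i].empty()) min_heap.emplace(runs[i][0], i, 0);
  }
  std::vector<int> merged;
  while (!min_heap.empty()) {
    auto [val, run, pos] = min_heap.top();
    min_heap.pop();
    merged.push_back(val);
    // Advance the run that produced the minimum; an exhausted run simply
    // drops out of the heap (the real merger transfers its resources to the
    // output batch instead of discarding them).
    if (pos + 1 < runs[run].size()) {
      min_heap.emplace(runs[run][pos + 1], run, pos + 1);
    }
  }
  return merged;
}
```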
Line 86: one row including current_row() remaining in the
: /// batch.
> i'm not sure what this is saying. also, looks like it's only called in Adv
Done


http://gerrit.cloudera.org:8080/#/c/2826/14/be/src/runtime/sorted-run-merger.h
File be/src/runtime/sorted-run-merger.h:

Line 68: void TransferAllResources(RowBatch* transfer_resource_batch);
> i don't see any callers or implementation of this. looks like it should be
Done


http://gerrit.cloudera.org:8080/#/c/2826/14/be/src/runtime/sorter.cc
File be/src/runtime/sorter.cc:

Line 71: If the run is unsorted, it is sorted in-place in
: /// memory. Once sorted, runs can be spilled to disk if the sorter does not have enough
: /// memory.
> I think this could be clearer if the cause-effect relationship is clearer.
I think that's starting to describe the algorithm implemented by the sorter, rather than the Run abstraction. If the sorter wanted to build shorter in-memory runs and then merge them, I don't think we'd change the Run abstraction.


Line 83:
> (if there was a single run) ?
Done


Line 83:
> (for merging if there are multiple runs) ?
Done


Line 89: ~Run() { DeleteAllBlocks(); }
> it would be best to avoid doing buffer management in the destructor. can yo
Done


Line 94: (
> why parenthesis?
Removed them.


Line 102: start_index
> document
Done


Line 103: return AddBatch(batch, start_index, num_processed);
> It will be DCHECKed inside, but adding this dcheck might clarify to the rea
Done


Line 105: Status AddIntermediateBatch(RowBatch* batch, int start_index, int* num_processed) {
> likewise, for documentation purposes:
Done


Line 109: AddBatch
> Add*Batch()?
Done


Line 131: .
> is this correct: (otherwise, var-length slots already contain pointers rath
Yeah, if the run stays in memory we don't ever convert them.


Line 132: when
> after? (i.e. it won't be set on the last batch that returns the final rows)
Done


Line 132: read
> returned?
Done


Line 133: .
> and has no attached blocks.
Done


Line 134: 2
> two (to match "one")
Done


Line 136: 1 (2
> one (two
Done


Line 137: unpinned (Sorter holds on to the memory).
> is this true? don't we transfer the blocks as they are completed?
Yeah, that was wrong. Reworded.


Line 140: If we leave the last run to be merged in memory, the fixed-len blocks can be
: /// unpinned as they are consumed.
> aren't the blocks transferred in this case? so what's left to do?
I should have deleted this comment.


Line 145: deleted
> is that accurate? or should it say "transferred"?
Done


Line 161: AddBatch
> this naming (with AddBatchInternal) is a little confusing given that AddBat
I moved the if() into AddInputBatch()/AddIntermediateBatch() (a sketch of this wrapper pattern follows after these comments).


Line 231: const int last_tuple_block_offset_;
> do we move between blocks often enough to make precalculating this worthwhi
Done.


Line 247: /// True if the tuples in the run are currently in sorted order.
> Always true for intermediate runs.
Done


Line 254: deleted
> transferred or deleted?
Done


Line 263: deleted
> transferred or deleted?
Done


Line 268: allocated
> remove
Done


Line 268: FinalizeInput
> why do it during FinalizeInput() rather than just always do it at UnpinAllB
If it's a single in-memory run, we won't call UnpinAllBlocks(). I guess we still keep var_len_copy_block_ around for in-memory runs with var-len data, so maybe I'll just remove this special-case optimization. It would be nice to delete var_len_copy_block_ earlier in the cases where we won't need it, but it's probably simpler to just delete it when the run is finished. Left a TODO.
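The AddInputBatch()/AddIntermediateBatch() discussion above (Lines 102-161) comes down to thin public wrappers that assert their preconditions and forward to one shared implementation. A minimal, self-contained sketch of that pattern follows; the names and simplified types are hypothetical stand-ins, not Impala's actual Run class.

```cpp
#include <cassert>
#include <vector>

struct Row { int key; };

// Sketch of the wrapper pattern: each public entry point documents and
// asserts which kind of run it applies to, then forwards to one shared
// implementation, so callers never branch on the run type themselves.
class Run {
 public:
  explicit Run(bool initial_run) : initial_run_(initial_run) {}

  // Appends unsorted input rows; only valid for an initial (input) run.
  void AddInputBatch(const std::vector<Row>& batch) {
    assert(initial_run_);
    AddBatchImpl(batch);
  }

  // Appends already-sorted rows produced by an intermediate merge; only
  // valid for an intermediate run.
  void AddIntermediateBatch(const std::vector<Row>& batch) {
    assert(!initial_run_);
    AddBatchImpl(batch);
  }

 private:
  void AddBatchImpl(const std::vector<Row>& batch) {
    rows_.insert(rows_.end(), batch.begin(), batch.end());
  }

  bool initial_run_;
  std::vector<Row> rows_;
};
```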
Line 269: UnpinAllBlocks
> UnpinBlocks() doesn't literally delete that block, so maybe clarify this a
Removed the code it was explaining.


Line 312: begin
> Begin
Done


Line 315: end
> End
Done


Line 319: .
> also updates 'index_'?
Done


Line 348: invalid index
> can this be more specific (does e.g. Prev(Next()) always get you back to wh
-1 and run->num_rows() are the only valid out-of-range values, so I think that makes it clearer combined with the Next()/Prev() comments.


Line 492: else sorter_->spilled_runs_counter_->Add(1);
> do we use one-line syntax if there is an else clause?
I'm not sure. Added {} to avoid it being too quirky.


Line 622:
> nit: extraneous blank lines
Done


Line 633: Intermediate merge runs are unpinned.
> not sure what this is trying to tell me. that intermediate runs are always
I think the explanation is simpler: the run is unpinned, so all blocks in the run should be unpinned now that we're done writing them. Moved the DCHECK and updated the comment.


Line 651: DCHECK(is_sorted_);
> DCHECK(is_initial_) also?
Done


Line 664: if (HasVarLenBlocks()) {
The var_len_copy_block_ change (not deleting it in FinalizeInput()) required a change here, since it broke the assumption that !HasVarLenBlocks() implies no copy block.


Line 729: // and the individual blocks do not need to be pinned.
> Is there a more direct way of explaining/checking this? Is this the case w
Yes, I think the point is that the blocks are already all pinned. The comment was actually wrong about buffered_batch, so I fixed that.


Line 775: // if it did, we would have to return them to the caller.
> This comment is confusing. Is it saying that the assumption might not be va
Done


Line 781:
> nit: many of these blank lines don't help readability and cause less code t
Done


Line 807: DeleteAllBlocks();
I pulled this out of the condition, because the new invariant is that GetNext() with eos should clear the block vectors, even if all the blocks in the vector are NULL.


Line 812: } else {
> maybe get rid of this else since L813 is really the start of the case at 81
Moved it to just before the return.


Line 840: // The var-len data is in the next block, so end this call to GetNext().
> would be good to explain "why" rather than "what". is it so the var length
Done


Line 853:
> nit: remove blank line
Done


Line 874: output_batch->AddBlock(var_len_blocks_[var_len_blocks_index_]);
> couldn't we have attached this var len block above, and so now we're passin
Because we only advance the var-len block when an encoded pointer refers to a block index greater than the current one. There won't be any encoded block indexes past the last block.


Line 883:
> nit: remove blank line
Done


Line 904: RETURN_IF_ERROR(next_block->Pin(&pinned, curr_block, false));
> if there is an error here, is it okay that curr_block is still in *blocks?
Good point, that was a bug.


Line 1047: tuple
> block containing the last tuple?
Done. Had to reword the next sentence a bit.


Line 1050: index_ = run->num_tuples();
> isn't this done already?
Done


Line 1070: Can't move to block past the end.
> I was confused by this comment until reading the code because it's referrin
Done


Line 1088: // Can't move to block before the first.
> similarly: When moving before the first tuple, stay at the first block.
Done


Line 1341: block_mgr_->ClearReservations(block_mgr_client_);
> it'd be best to not do all this buffer management work in the destructor, b
Added Sorter::Close() to match the pattern elsewhere.
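Several of the comments above (Lines 312-348 and 1047-1088) concern the tuple iterator's contract: Next() and Prev() may move one position past either end, and -1 and num_tuples() are the only valid out-of-range positions. The sketch below illustrates just that contract over a simplified flat array; it is a hypothetical stand-in, not the sorter's real block-based TupleIterator.

```cpp
#include <cassert>

// Simplified iterator with sentinel positions at -1 and num_tuples(), so
// that stepping one past either end is well-defined but dereferencing there
// is not.
class TupleIterator {
 public:
  TupleIterator(const int* tuples, int num_tuples)
      : tuples_(tuples), num_tuples_(num_tuples), index_(0) {}

  bool AtEnd() const { return index_ == num_tuples_; }
  bool BeforeBegin() const { return index_ == -1; }

  void Next() {
    assert(index_ < num_tuples_);  // Can't advance past the end sentinel.
    ++index_;
  }

  void Prev() {
    assert(index_ > -1);  // Can't move before the begin sentinel.
    --index_;
  }

  int Get() const {
    // Only in-range positions may be dereferenced.
    assert(index_ >= 0 && index_ < num_tuples_);
    return tuples_[index_];
  }

 private:
  const int* tuples_;
  int num_tuples_;
  int index_;
};
```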
Line 1408: if (block_mgr_->num_free_buffers() < min_buffers_for_merge - blocks_per_run) {
> I don't understand why we shouldn't just always unpin this last run, given
I agree. See http://gerrit.cloudera.org/#/c/2908/

I deliberately avoided making algorithmic changes to this part of the code in this patch, since I think it was easier to reason about that set of changes independent of this cleanup. I think you're right about the var_len_copy_block_ case (it holds onto it for longer than needed), but I think we can defer that until the follow-up patch.


Line 1441: // in the pinned blocks in the single sorted run.
> Is this comment still relevant? What is it trying to tell me?
Yeah, we handle resource transfer in Run::GetNext(), so this isn't helpful.


Line 1444: DCHECK_EQ(output_batch->num_rows(), 0);
> what is this DCHECK documenting? why is this important on this path? (also
You're right, I don't think this is important any more.


Line 1448: // resource.
> is this comment still relevant given the cleanup to memory management? isn'
Yeah, that's right. Removed the comment.


Line 1456: merging_runs_.clear();
The invariants around Reset() aren't clear. I added logic here to clean up rows, assuming that GetNext() may not have returned eos before the Reset(). I'm going to talk to Alex to figure out whether this is even allowed.


Line 1504: Run* merged_run = obj_pool_.Add(
merged_run is a case where we relied on the destructor calling DeleteAllBlocks(), in case we hit an error before adding it to sorted_runs_. I fixed this by adding it immediately to sorted_runs_.


http://gerrit.cloudera.org:8080/#/c/2826/14/be/src/runtime/sorter.h
File be/src/runtime/sorter.h:

Line 48: /// Batches of input rows are collected into a sequence of pinned BufferedBlockMgr blocks
I cleaned up this paragraph a little.


Line 60: After the input is consumed, the sorter is left with one or more sorted runs. The
: /// client calls GetNext(output_batch) to retrieve batches of sorted rows. If there are
: /// multiple runs, the runs are merged using SortedRunMerger to produce a stream of sorted
: /// tuples. At least one block per run (two if there are var-length slots) must be pinned
: /// in memory during a merge, so multiple merges may be necessary if the number of runs is
: /// too large.
> not your change but this paragraph is kind of confusing because it talks ab
Reorganised to explain it in a more logical order.


Line 77: final merge)
> again confusing because the comment didn't really explain how merging happe
I think it should be clearer now.


Line 137: Finalizes
> not sure what it means to "finalize" unsorted_run_ from just reading this h
Reworded. I think it's clearer to just say that it's called when there are no more runs to be appended.


Line 199: unsorted
> they become sorted, right? so maybe just say "initial runs"?
Done

--
To view, visit http://gerrit.cloudera.org:8080/2826
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9c619e81fd1b8ac50e257172c8bce101a112b52a
Gerrit-PatchSet: 14
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Dan Hecht
Gerrit-Reviewer: Matthew Jacobs
Gerrit-Reviewer: Tim Armstrong
Gerrit-HasComments: Yes
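As an illustration of the sorter.h paragraph quoted above (Line 60): each run in a merge needs at least one pinned block, or two if there are var-length slots, so a fixed buffer budget caps the merge fan-in and can force intermediate merge passes. The arithmetic below is a rough, hypothetical sketch of that constraint, not Impala's actual buffer-reservation logic.

```cpp
#include <cstdio>

// Maximum number of runs that can participate in one merge, given that each
// run needs one pinned block (two with var-length slots).
int MaxMergeFanIn(int available_buffers, bool has_var_len_slots) {
  const int blocks_per_run = has_var_len_slots ? 2 : 1;
  return available_buffers / blocks_per_run;
}

// Number of merge passes needed to reduce 'num_runs' runs to a single
// sorted stream at the given fan-in (simplified pass counting).
int NumMergePasses(int num_runs, int fan_in) {
  if (fan_in <= 1) return -1;  // Not enough buffers to merge at all.
  int passes = 0;
  while (num_runs > 1) {
    num_runs = (num_runs + fan_in - 1) / fan_in;  // ceil(num_runs / fan_in)
    ++passes;
  }
  return passes;
}

int main() {
  // Example: 50 spilled runs with var-length data and 20 free buffers give
  // a fan-in of 10, so two merge passes are needed (50 -> 5 -> 1).
  const int fan_in = MaxMergeFanIn(20, /*has_var_len_slots=*/true);
  std::printf("fan-in=%d passes=%d\n", fan_in, NumMergePasses(50, fan_in));
  return 0;
}
```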