Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id ECDB1200AE3 for ; Thu, 5 May 2016 01:27:07 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id EB8181609FF; Wed, 4 May 2016 23:27:07 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 3DEFC1609FC for ; Thu, 5 May 2016 01:27:07 +0200 (CEST) Received: (qmail 18986 invoked by uid 500); 4 May 2016 23:27:06 -0000 Mailing-List: contact dev-help@impala.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@impala.incubator.apache.org Delivered-To: mailing list dev@impala.incubator.apache.org Received: (qmail 18973 invoked by uid 99); 4 May 2016 23:27:06 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd4-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 04 May 2016 23:27:06 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd4-us-west.apache.org (ASF Mail Server at spamd4-us-west.apache.org) with ESMTP id AF9C8C015A for ; Wed, 4 May 2016 23:27:05 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd4-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 0.362 X-Spam-Level: X-Spam-Status: No, score=0.362 tagged_above=-999 required=6.31 tests=[RDNS_DYNAMIC=0.363, SPF_PASS=-0.001] autolearn=disabled Received: from mx2-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd4-us-west.apache.org [10.40.0.11]) (amavisd-new, port 10024) with ESMTP id 3oGhkq_kbZ84 for ; Wed, 4 May 2016 23:27:03 +0000 (UTC) Received: from ip-10-146-233-104.ec2.internal (ec2-75-101-130-251.compute-1.amazonaws.com [75.101.130.251]) by mx2-lw-eu.apache.org (ASF Mail Server at mx2-lw-eu.apache.org) with ESMTPS id 1483A5FB1E for ; Wed, 4 May 2016 23:27:02 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by ip-10-146-233-104.ec2.internal (8.14.4/8.14.4) with ESMTP id u44NR1Vu009915; Wed, 4 May 2016 23:27:01 GMT Message-Id: <201605042327.u44NR1Vu009915@ip-10-146-233-104.ec2.internal> Date: Wed, 4 May 2016 23:27:01 +0000 From: "Tim Armstrong (Code Review)" To: impala-cr@cloudera.com, dev@impala.incubator.apache.org CC: Matthew Jacobs , Dan Hecht Reply-To: tarmstrong@cloudera.com X-Gerrit-MessageType: comment Subject: =?UTF-8?Q?[Impala-CR](cdh5-trunk)_IMPALA-3344:_Simplify_sorter_and_document/enforce_invariants.=0A?= X-Gerrit-Change-Id: I9c619e81fd1b8ac50e257172c8bce101a112b52a X-Gerrit-ChangeURL: X-Gerrit-Commit: 46667ca5ad40f456925be0ab9017a552743601fc In-Reply-To: References: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Content-Disposition: inline User-Agent: Gerrit/2.10-rc0 archived-at: Wed, 04 May 2016 23:27:08 -0000 Tim Armstrong has posted comments on this change. Change subject: IMPALA-3344: Simplify sorter and document/enforce invariants. ...................................................................... Patch Set 7: (8 comments) http://gerrit.cloudera.org:8080/#/c/2826/7/be/src/runtime/sorted-run-merger.cc File be/src/runtime/sorted-run-merger.cc: Line 33: BatchedRowSupplier > A few high level things: I couldn't think of anything much clearer. Done. 2) See my earlier comment - I think the right thing to do here is to avoid handling need_to_return() and simplify this. I think the three general alternatives here are a) copy the rows so they don't reference anything passed up from the sorter, b) do the complicated TryAdvance() handshake so that we can free resources before moving to the next row or c) use double the resources while transitioning between blocks. It doesn't seem like the copying is a major performance bottleneck right now, so it's probably the best approach until it shows up on profiles. Line 55: exhausted > exhausted, Done Line 57: resouce > resources Done Line 59: Next > Per my larger comment at the top, how about: I think 1) and 2) aren't relevant any more since I removed handling of the need_to_return case. I did rename Next() to Advance(). 3) is done. Line 86: there is at least one row remaining in the batch. > ...at least one row (including current_row()) remaining in the current batc Done Line 87: before 'done' is returned. > before Next() returns with 'done' set to true Done http://gerrit.cloudera.org:8080/#/c/2826/7/be/src/runtime/sorted-run-merger.h File be/src/runtime/sorted-run-merger.h: Line 69: If 'deep_copy_input_' is false > If true, this should always be able to advance, right? Can you maybe add th With the other changes this got simplier. Line 78: AdvanceMinRow > TryAdvanceMinRow? See above. -- To view, visit http://gerrit.cloudera.org:8080/2826 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9c619e81fd1b8ac50e257172c8bce101a112b52a Gerrit-PatchSet: 7 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes