Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 10B06200D71 for ; Thu, 21 Dec 2017 00:00:16 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id 0EFFF160C15; Wed, 20 Dec 2017 23:00:16 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 7BC63160BF9 for ; Thu, 21 Dec 2017 00:00:15 +0100 (CET) Received: (qmail 65470 invoked by uid 500); 20 Dec 2017 23:00:14 -0000 Mailing-List: contact dev-help@drill.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@drill.apache.org Delivered-To: mailing list dev@drill.apache.org Received: (qmail 65456 invoked by uid 99); 20 Dec 2017 23:00:14 -0000 Received: from git1-us-west.apache.org (HELO git1-us-west.apache.org) (140.211.11.23) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 20 Dec 2017 23:00:14 +0000 Received: by git1-us-west.apache.org (ASF Mail Server at git1-us-west.apache.org, from userid 33) id 29C87DFF59; Wed, 20 Dec 2017 23:00:14 +0000 (UTC) From: paul-rogers To: dev@drill.apache.org Reply-To: dev@drill.apache.org References: In-Reply-To: Subject: [GitHub] drill pull request #1073: DRILL-5967: Fixed memory leak in OrderedPartitionS... Content-Type: text/plain Message-Id: <20171220230014.29C87DFF59@git1-us-west.apache.org> Date: Wed, 20 Dec 2017 23:00:14 +0000 (UTC) archived-at: Wed, 20 Dec 2017 23:00:16 -0000 Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1073#discussion_r158158542 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java --- @@ -341,6 +351,10 @@ public void close() throws Exception { updateAggregateStats(); partitioner.clear(); } + + if (closeIncoming) { + ((CloseableRecordBatch) incoming).close(); + } --- End diff -- Under what scenario would we *not* close the incoming? This fix introduces two paths: one that closes, one that does not. Is the non-closing path needed? More to the point, do we have unit tests that verify both paths? Can we simplify this and just define that this operator will close the incoming? Actually, let's take a step back. This fix is based on a false premise (but one a mistake that is easy to make.) This fix assumes that a parent batch (operator) is responsible for closing its child. But, that is not how the Drill protocol works. Instead, the fragment executor maintains the operator tree. Upon exit, it walks the tree and closes all its children in order from top down. So, the true source of this bug is that either a) the fragment exec is failing to do the close, b) that the incoming is not releasing memory on close, or c) that some close operation reallocates memory that a previous close tried to release. Let's suppose the issue is a: that the fragment exec does not manage the incoming. Let's fix that. Similar with the other cases. Can you maybe do a bit more research to understand the full scope of the issue? ---