Mailing-List: contact issues-help@drill.apache.org; run by ezmlm
Reply-To: dev@drill.apache.org
Date: Fri, 23 Dec 2016 08:08:58 +0000 (UTC)
From: "Paul Rogers (JIRA)"
To: issues@drill.apache.org
Subject: [jira] [Comment Edited] (DRILL-5156) Bit-Client thread finds closed allocator in TestDrillbitResilience unit test
archived-at: Fri, 23 Dec 2016 08:09:00 -0000

[ https://issues.apache.org/jira/browse/DRILL-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15772269#comment-15772269 ]

Paul Rogers edited comment on DRILL-5156 at 12/23/16 8:08 AM:
--------------------------------------------------------------

The
problem appears to be a bug in {{BootStrapContext}}, which creates two thread pools but never shuts them down. The two pools are for the "BitClient-n" and "BitServer-n" threads. During close, the {{BootStrapContext.close()}} method closes the allocator but leaves the threads running. Because they are still running, the BitClient thread attempts to use the (now closed) allocator and triggers the {{IllegalStateException}}.

This behavior is easy to see by setting the breakpoint described in the issue description. Leave the thread stopped at that breakpoint. The rest of the Drillbit shuts down around the suspended thread, showing that the Drillbit did not wait for the thread.

The fix is simple:

{code}
  public void close() {
    try {
      loop2.shutdownGracefully(0, 0, TimeUnit.SECONDS);
    } catch ( Exception e ) {
      logger.warn("Failure During Bit-Client shutdown.", e);
    }
    try {
      loop.shutdownGracefully(0, 0, TimeUnit.SECONDS);
    } catch ( Exception e ) {
      logger.warn("Failure During Bit-Server shutdown.", e);
    }
    ...
{code}

After this fix, the test case runs fine with no {{IllegalStateException}}s.

> Bit-Client thread finds closed allocator in TestDrillbitResilience unit test
> ----------------------------------------------------------------------------
>
>                 Key: DRILL-5156
>                 URL: https://issues.apache.org/jira/browse/DRILL-5156
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Minor
>
> An RPC thread attempts to access a closed allocator during the {{TestDrillbitResilience}} unit test.
> Set a Java exception breakpoint for {{IllegalStateException}}. Run the {{TestDrillbitResilience}} unit tests.
> You will see quite a few exceptions, including the following in a thread called BitClient-1:
> {code}
> RootAllocator(BaseAllocator).assertOpen() line 109
> RootAllocator(BaseAllocator).buffer(int) line 191
> DrillByteBufAllocator.buffer(int) line 49
> DrillByteBufAllocator.ioBuffer(int) line 64
> AdaptiveRecvByteBufAllocator$HandleImpl.allocate(ByteBufAllocator) line 104
> NioSocketChannel$NioSocketChannelUnsafe(...).read() line 117
> ...
> NioEventLoop.run() line 354
> {code}
> The test continues (then fails for some other reason), which is why this is marked as minor. Still, it seems odd that the client thread should attempt to access a closed allocator.
> At this point, it is not clear how we got into this state. The test itself is waiting for a response from the server in the {{failsAfterMSorterSorting}} test.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
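
For readers who want to experiment with the shutdown ordering discussed in the comment, here is a minimal, self-contained sketch. It is not Drill code: it uses a plain java.util.concurrent thread pool in place of the Netty event loops, and the names ShutdownOrderSketch, FakeAllocator and workerSawClosedAllocator are hypothetical, made up for illustration. The only point it tries to show is the ordering: stop the worker threads and wait for them, and only then close the resource they share.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class ShutdownOrderSketch {

  /** Hypothetical stand-in for an allocator: throws once it has been closed. */
  static final class FakeAllocator implements AutoCloseable {
    private final AtomicBoolean open = new AtomicBoolean(true);

    void buffer() {
      if (!open.get()) {
        throw new IllegalStateException("allocator is closed");
      }
    }

    @Override
    public void close() {
      open.set(false);
    }
  }

  /** Returns true if the worker thread ever observed the closed allocator. */
  public static boolean workerSawClosedAllocator() {
    FakeAllocator allocator = new FakeAllocator();
    AtomicBoolean sawClosed = new AtomicBoolean(false);
    ExecutorService pool = Executors.newSingleThreadExecutor();

    // Worker keeps using the allocator until it is interrupted.
    pool.submit(() -> {
      while (!Thread.currentThread().isInterrupted()) {
        try {
          allocator.buffer();
        } catch (IllegalStateException e) {
          sawClosed.set(true);
          return;
        }
      }
    });

    // Step 1: stop the worker and wait until it has really exited.
    pool.shutdownNow();
    try {
      if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
        throw new IllegalStateException("worker did not stop in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for worker", e);
    }

    // Step 2: only now close the shared resource.
    allocator.close();
    return sawClosed.get();
  }

  public static void main(String[] args) {
    System.out.println("worker saw closed allocator: " + workerSawClosedAllocator());
  }
}
```

Swapping the two steps (closing the allocator before the pool has terminated) lets the worker hit the IllegalStateException, which is the same kind of race the stack trace in the issue description shows.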