Date: Mon, 26 Jun 2017 09:35:00 +0000 (UTC)
From: "Roman (JIRA)"
To: dev@drill.apache.org
Reply-To: dev@drill.apache.org
Subject: [jira] [Created] (DRILL-5609) Resources leak on parquet table when the query hangs with CANCELLATION_REQUESTED state

Roman created DRILL-5609:
----------------------------

             Summary: Resources leak on parquet table when the query hangs with CANCELLATION_REQUESTED state
                 Key: DRILL-5609
                 URL: https://issues.apache.org/jira/browse/DRILL-5609
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Flow
    Affects Versions: 1.11.0
            Reporter: Roman

I ran tpcds_sf100-query2 against a parquet table with 10 concurrent threads on a single-node drillbit cluster (Drill with the DRILL-5599 fix applied) and hit a resource leak. The query hung in the CANCELLATION_REQUESTED state.

Steps to reproduce:
1) Start ConcurrencyTest.java with tpcds_sf100-query2 on the parquet table (in attachment).
2) Wait 3-5 seconds and press Ctrl+C to kill the client.
3) Repeat step 2) several times until some queries show "CANCELLATION_REQUESTED".

The queries hang until the drillbit is restarted. Running "top" shows that the drillbit is still using CPU.

Jstack example:
{code:xml}
"26af36b2-7a44-5af8-e0c3-95a4f132fc7a:frag:14:1" #1268 daemon prio=10 os_prio=0 tid=0x00007f25a5afa800 nid=0x16f2 runnable [0x00007f2535a5a000]
   java.lang.Thread.State: RUNNABLE
	at java.lang.Throwable.fillInStackTrace(Native Method)
	at java.lang.Throwable.fillInStackTrace(Throwable.java:783)
	- locked <0x0000000728ca82b0> (a java.lang.InterruptedException)
	at java.lang.Throwable.<init>(Throwable.java:250)
	at java.lang.Exception.<init>(Exception.java:54)
	at java.lang.InterruptedException.<init>(InterruptedException.java:57)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220)
	at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)
	at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:439)
	at org.apache.drill.exec.store.parquet.columnreaders.AsyncPageReader.clear(AsyncPageReader.java:301)
	at org.apache.drill.exec.store.parquet.columnreaders.ColumnReader.clear(ColumnReader.java:147)
	at org.apache.drill.exec.store.parquet.columnreaders.ReadState.close(ReadState.java:179)
	at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.close(ParquetRecordReader.java:318)
	at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:209)
	at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
	at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
	at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
	at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:133)
	at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
	at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:105)
	at org.apache.drill.exec.physical.impl.broadcastsender.BroadcastSenderRootExec.innerNext(BroadcastSenderRootExec.java:95)
	at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:95)
	at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:234)
	at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:227)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595)
	at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:227)
	at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
{code}

I have attached drillbit.log and the full jstack log.
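
A note on the trace above: its top frames show LinkedBlockingQueue.take() constructing an InterruptedException inside AsyncPageReader.clear() while the thread is RUNNABLE. That would be consistent with a cleanup loop that keeps retrying take() on a thread whose interrupt flag is set, which could also explain the CPU use seen in "top". The sketch below is a standalone, JDK-only illustration of that behaviour and is not Drill's code. The class InterruptedDrainDemo and its methods are hypothetical names, and whether AsyncPageReader.clear() really loops this way is an assumption that would need to be checked against the source.

```java
import java.util.Arrays;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical demo class, not part of Drill.
public class InterruptedDrainDemo {

    /**
     * Tries to drain a non-empty queue with take() while the current thread's
     * interrupt flag is set, restoring the flag after every failure (the
     * assumed pattern). Returns how many take() calls failed. The loop is
     * capped at maxAttempts so the demo terminates; without the cap it would
     * spin for as long as the queue stays non-empty.
     */
    public static int failedTakesWhileInterrupted(LinkedBlockingQueue<String> queue,
                                                  int maxAttempts) {
        int failed = 0;
        Thread.currentThread().interrupt(); // simulate a cancelled fragment thread
        try {
            for (int i = 0; i < maxAttempts && !queue.isEmpty(); i++) {
                try {
                    // take() acquires its lock interruptibly, so it throws at
                    // once when the flag is set, even if elements are available.
                    queue.take();
                } catch (InterruptedException e) {
                    failed++;
                    Thread.currentThread().interrupt(); // flag is set again: next take() fails too
                }
            }
        } finally {
            Thread.interrupted(); // clear the flag so the caller is unaffected
        }
        return failed;
    }

    /** Drains the queue with poll(), which does not throw InterruptedException. */
    public static int drainWithPoll(LinkedBlockingQueue<String> queue) {
        int drained = 0;
        while (queue.poll() != null) {
            drained++;
        }
        return drained;
    }

    public static void main(String[] args) {
        LinkedBlockingQueue<String> pages =
                new LinkedBlockingQueue<>(Arrays.asList("page-1", "page-2", "page-3"));

        int failed = failedTakesWhileInterrupted(pages, 1000);
        System.out.println("failed take() calls: " + failed);
        System.out.println("elements still queued: " + pages.size());
        System.out.println("drained with poll(): " + drainWithPoll(pages));
    }
}
```

If the assumption holds, draining with poll(), or restoring the interrupt flag only after the queue is empty, might let the cleanup finish. This is a suggestion to investigate, not a verified fix.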