Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 7E5DB200D0B for ; Wed, 27 Sep 2017 18:07:06 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 7CCF21609BC; Wed, 27 Sep 2017 16:07:06 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 725511609EB for ; Wed, 27 Sep 2017 18:07:05 +0200 (CEST) Received: (qmail 76921 invoked by uid 500); 27 Sep 2017 16:07:04 -0000 Mailing-List: contact issues-help@drill.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@drill.apache.org Delivered-To: mailing list issues@drill.apache.org Received: (qmail 76912 invoked by uid 99); 27 Sep 2017 16:07:04 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 27 Sep 2017 16:07:04 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id 3429CDE128 for ; Wed, 27 Sep 2017 16:07:04 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -99.202 X-Spam-Level: X-Spam-Status: No, score=-99.202 tagged_above=-999 required=6.31 tests=[KAM_ASCII_DIVIDERS=0.8, RP_MATCHES_RCVD=-0.001, SPF_PASS=-0.001, USER_IN_WHITELIST=-100] autolearn=disabled Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id 8ak251ZR8x3A for ; Wed, 27 Sep 2017 16:07:02 +0000 (UTC) Received: from mailrelay1-us-west.apache.org (mailrelay1-us-west.apache.org [209.188.14.139]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTP id 1C6E86125E for ; Wed, 27 Sep 2017 16:07:02 +0000 (UTC) Received: from jira-lw-us.apache.org (unknown [207.244.88.139]) by mailrelay1-us-west.apache.org (ASF Mail Server at mailrelay1-us-west.apache.org) with ESMTP id 6399CE061C for ; Wed, 27 Sep 2017 16:07:01 +0000 (UTC) Received: from jira-lw-us.apache.org (localhost [127.0.0.1]) by jira-lw-us.apache.org (ASF Mail Server at jira-lw-us.apache.org) with ESMTP id 9D4BA242B3 for ; Wed, 27 Sep 2017 16:07:00 +0000 (UTC) Date: Wed, 27 Sep 2017 16:07:00 +0000 (UTC) From: "Vlad Rozov (JIRA)" To: issues@drill.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Resolved] (DRILL-5813) A query that includes sort loses Drill connection. Drill sometimes crashes. MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 archived-at: Wed, 27 Sep 2017 16:07:06 -0000 [ https://issues.apache.org/jira/browse/DRILL-5813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Rozov resolved DRILL-5813. ------------------------------- Resolution: Not A Bug Fix Version/s: (was: 1.12.0) When Drill is started with assertion enabled (-ea) it tracks every memory allocation and the option requires more memory. See if the same is reproducible with assertion disabled (the option that will be used in production). > A query that includes sort loses Drill connection. Drill sometimes crashes. > ---------------------------------------------------------------------------- > > Key: DRILL-5813 > URL: https://issues.apache.org/jira/browse/DRILL-5813 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators > Affects Versions: 1.11.0 > Reporter: Robert Hou > Assignee: Vlad Rozov > Attachments: drillbit.log, drill.log > > > Query is: > {noformat} > ALTER SESSION SET `exec.sort.disable_managed` = false; > alter session set `planner.enable_decimal_data_type` = true; > select count(*) from (select * from dfs.`/drill/testdata/resource-manager/all_types_large` order by missing11) d where d.missing3 is false; > {noformat} > This query has passed before when the number of threads and amount of memory is restricted. With more threads and memory, the query does not complete execution. > Here is the stack trace: > {noformat} > Exception occurred with closed channel. Connection: /10.10.100.190:59281 <--> /10.10.100.190:31010 (user client) > java.io.IOException: Connection reset by peer > at sun.nio.ch.FileDispatcherImpl.read0(Native Method) > at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) > at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) > at sun.nio.ch.IOUtil.read(IOUtil.java:192) > at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:384) > at oadd.io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:311) > at oadd.io.netty.buffer.WrappedByteBuf.setBytes(WrappedByteBuf.java:407) > at oadd.io.netty.buffer.UnsafeDirectLittleEndian.setBytes(UnsafeDirectLittleEndian.java:32) > at oadd.io.netty.buffer.DrillBuf.setBytes(DrillBuf.java:792) > at oadd.io.netty.buffer.MutableWrappedByteBuf.setBytes(MutableWrappedByteBuf.java:280) > at oadd.io.netty.buffer.ExpandableByteBuf.setBytes(ExpandableByteBuf.java:26) > at oadd.io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881) > at oadd.io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:241) > at oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119) > at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) > at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) > at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) > at oadd.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) > at oadd.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) > at java.lang.Thread.run(Thread.java:745) > User Error Occurred: Connection /10.10.100.190:59281 <--> /10.10.100.190:31010 (user client) closed unexpectedly. Drillbit down? > oadd.org.apache.drill.common.exceptions.UserException: CONNECTION ERROR: Connection /10.10.100.190:59281 <--> /10.10.100.190:31010 (user client) closed un > expectedly. Drillbit down? > [Error Id: b97704a4-b8f0-4cd0-b428-2cf1bcf39a1d ] > at oadd.org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550) > at oadd.org.apache.drill.exec.rpc.user.QueryResultHandler$ChannelClosedHandler$1.operationComplete(QueryResultHandler.java:373) > at oadd.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680) > at oadd.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603) > at oadd.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563) > at oadd.io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406) > at oadd.io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82) > at oadd.io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:943) > at oadd.io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:592) > at oadd.io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:584) > at oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.closeOnRead(AbstractNioByteChannel.java:71) > at oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.handleReadException(AbstractNioByteChannel.java:89) > at oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:162) > at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) > at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) > at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) > at oadd.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) > at oadd.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) > at java.lang.Thread.run(Thread.java:745) > [#14] Query failed: > oadd.org.apache.drill.common.exceptions.UserException: CONNECTION ERROR: Connection /10.10.100.190:59281 <--> /10.10.100.190:31010 (user client) closed un > expectedly. Drillbit down? > [Error Id: b97704a4-b8f0-4cd0-b428-2cf1bcf39a1d ] > at oadd.org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550) > at oadd.org.apache.drill.exec.rpc.user.QueryResultHandler$ChannelClosedHandler$1.operationComplete(QueryResultHandler.java:373) > at oadd.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680) > at oadd.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603) > at oadd.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563) > at oadd.io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406) > at oadd.io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82) > at oadd.io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:943) > at oadd.io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:592) > at oadd.io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:584) > at oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.closeOnRead(AbstractNioByteChannel.java:71) > at oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.handleReadException(AbstractNioByteChannel.java:89) > at oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:162) > at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) > at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) > at oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) > at oadd.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) > at oadd.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) > at java.lang.Thread.run(Thread.java:745) > SYSTEM ERROR: IllegalArgumentException: Attempted to send a message when connection is no longer valid. > Query submission to Drillbit failed. > [Error Id: 72a44d5e-0e4a-4343-ac78-ff4bb1cb1219 ] > oadd.org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IllegalArgumentException: Attempted to send a message when connection is no longer va > lid. > Query submission to Drillbit failed. > [Error Id: 72a44d5e-0e4a-4343-ac78-ff4bb1cb1219 ] > at oadd.org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550) > at oadd.org.apache.drill.exec.rpc.user.QueryResultHandler$SubmissionListener.failed(QueryResultHandler.java:294) > at oadd.org.apache.drill.exec.rpc.RpcBus.send(RpcBus.java:124) > at oadd.org.apache.drill.exec.rpc.RpcBus.send(RpcBus.java:91) > at oadd.org.apache.drill.exec.rpc.BasicClient.send(BasicClient.java:174) > at oadd.org.apache.drill.exec.rpc.user.UserClient.submitQuery(UserClient.java:129) > at oadd.org.apache.drill.exec.client.DrillClient.runQuery(DrillClient.java:794) > at org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:543) > at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1895) > at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:61) > at oadd.org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:473) > at org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1100) > at oadd.org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:477) > at org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:181) > at oadd.org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:109) > at oadd.org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:130) > at org.apache.drill.jdbc.impl.DrillStatementImpl.executeQuery(DrillStatementImpl.java:112) > at org.apache.drill.test.framework.DrillTestJdbc.executeSetupQuery(DrillTestJdbc.java:181) > at org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:139) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: oadd.org.apache.drill.exec.rpc.RpcException: Failure sending message. > ... 22 more > Caused by: java.lang.IllegalArgumentException: Attempted to send a message when connection is no longer valid. > at oadd.com.google.common.base.Preconditions.checkArgument(Preconditions.java:122) > at oadd.org.apache.drill.exec.rpc.RequestIdMap.createNewRpcListener(RequestIdMap.java:88) > at oadd.org.apache.drill.exec.rpc.AbstractRemoteConnection.createNewRpcListener(AbstractRemoteConnection.java:162) > at oadd.org.apache.drill.exec.rpc.RpcBus.send(RpcBus.java:117) > ... 21 more > [#15] Query failed: > oadd.org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IllegalArgumentException: Attempted to send a message when connection is no longer va > lid. > Query submission to Drillbit failed. > [Error Id: 72a44d5e-0e4a-4343-ac78-ff4bb1cb1219 ] > at oadd.org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550) > at oadd.org.apache.drill.exec.rpc.user.QueryResultHandler$SubmissionListener.failed(QueryResultHandler.java:294) > at oadd.org.apache.drill.exec.rpc.RpcBus.send(RpcBus.java:124) > at oadd.org.apache.drill.exec.rpc.RpcBus.send(RpcBus.java:91) > at oadd.org.apache.drill.exec.rpc.BasicClient.send(BasicClient.java:174) > at oadd.org.apache.drill.exec.rpc.user.UserClient.submitQuery(UserClient.java:129) > at oadd.org.apache.drill.exec.client.DrillClient.runQuery(DrillClient.java:794) > at org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:543) > at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1895) > at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:61) > at oadd.org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:473) > at org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1100) > at oadd.org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:477) > at org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:181) > at oadd.org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:109) > at oadd.org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:130) > at org.apache.drill.jdbc.impl.DrillStatementImpl.executeQuery(DrillStatementImpl.java:112) > at org.apache.drill.test.framework.DrillTestJdbc.executeSetupQuery(DrillTestJdbc.java:181) > at org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:139) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: oadd.org.apache.drill.exec.rpc.RpcException: Failure sending message. > ... 22 more > Caused by: java.lang.IllegalArgumentException: Attempted to send a message when connection is no longer valid. > at oadd.com.google.common.base.Preconditions.checkArgument(Preconditions.java:122) > at oadd.org.apache.drill.exec.rpc.RequestIdMap.createNewRpcListener(RequestIdMap.java:88) > at oadd.org.apache.drill.exec.rpc.AbstractRemoteConnection.createNewRpcListener(AbstractRemoteConnection.java:162) > at oadd.org.apache.drill.exec.rpc.RpcBus.send(RpcBus.java:117) > ... 21 more > Failed while running cleanup query. Not returning connection to pool. > org.apache.drill.jdbc.AlreadyClosedSqlException: ResultSet is already closed. > at org.apache.drill.jdbc.impl.DrillResultSetImpl.throwIfClosed(DrillResultSetImpl.java:96) > at org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:128) > at org.apache.drill.test.framework.DrillTestJdbc.executeSetupQuery(DrillTestJdbc.java:185) > at org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:139) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > {noformat} > The first error in drillbit.log: > {noformat} > 2017-09-22 13:18:51,720 [BitServer-3] WARN o.a.d.exec.rpc.control.ControlClient - Message of mode RESPONSE of rpc type 1 took longer than 500ms. Actual duration was 197783ms. > 2017-09-22 13:25:51,944 [BitServer-5] WARN o.a.drill.exec.rpc.data.DataServer - Message of mode PING of rpc type 0 took longer than 500ms. Actual duration was 394057ms. > 2017-09-22 13:38:40,957 [BitServer-5] ERROR o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication. Connection: /10.10.100.190:31012 <--> /10.10.100.190:43721 (data server). Closing connection. > java.lang.OutOfMemoryError: Java heap space > 2017-09-22 13:38:53,698 [UserServer-1] ERROR o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication. Connection: /10.10.100.190:31010 <--> /10.10.100.190:59281 (user server). Closing connection. > java.lang.OutOfMemoryError: Java heap space > 2017-09-22 13:42:14,255 [UserServer-1] DEBUG o.a.drill.exec.work.foreman.Foreman - 263aa12c-a069-ef76-02ff-6314ff72dca7: State change requested RUNNING --> CANCELLATION_REQUESTED > 2017-09-22 13:46:43,615 [263aa12c-a069-ef76-02ff-6314ff72dca7:frag:2:1] ERROR o.a.drill.common.CatastrophicFailure - Catastrophic Failure Occurred, exiting. Information message: Unable to handle out of memory condition in FragmentExecutor. > java.lang.OutOfMemoryError: Java heap space > 2017-09-22 13:52:25,011 [Drillbit-ShutdownHook#0] INFO o.apache.drill.exec.server.Drillbit - Received shutdown request. > 2017-09-22 13:52:43,818 [Drillbit-ShutdownHook#0] DEBUG o.apache.drill.exec.server.Drillbit - Shutdown begun. > 2017-09-22 13:53:00,684 [BitClient-1] ERROR o.apache.drill.exec.rpc.BasicClient - Unable to maintain connection /10.10.100.190:43721 <--> qa-node190.qa.lab/10.10.100.190:31012 (data client). Closing connection. > {noformat} > sing cln q47 -- This message was sent by Atlassian JIRA (v6.4.14#64029)