drill-issues mailing list archives

From "Khurram Faraaz (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-3751) Query hang when zookeeper is stopped
Date Tue, 08 Sep 2015 19:00:46 GMT
Khurram Faraaz created DRILL-3751:
-------------------------------------

             Summary: Query hang when zookeeper is stopped
                 Key: DRILL-3751
                 URL: https://issues.apache.org/jira/browse/DRILL-3751
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Flow
    Affects Versions: 1.2.0
         Environment: 4 node cluster on CentOS
            Reporter: Khurram Faraaz
            Assignee: Chris Westin
             Fix For: 1.2.0


I see sqlline hang indefinitely when I issue a long-running query and then stop the zookeeper
process while the query is still executing. The sqlline prompt is never returned; it hangs
after printing the stack trace below. I am on master.

Steps to reproduce the problem:
clush -g khurram service mapr-warden stop
clush -g khurram service mapr-warden start
Issue a long-running query from sqlline.
While the query is running, stop zookeeper using the zkServer.sh script.

To stop zookeeper:
{code}
[root@centos-01 bin]# ./zkServer.sh stop
JMX enabled by default
Using config: /opt/mapr/zookeeper/zookeeper-3.4.5/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
{code}

Issue the long-running query below from sqlline:
{code}
./sqlline -u "jdbc:drill:schema=dfs.tmp"
0: jdbc:drill:schema=dfs.tmp> select * from `twoKeyJsn.json` limit 8000000;
...
| 7.40907649723E8  | g    |
| 1.12378007695E9  | d    |
03:03:28.482 [CuratorFramework-0] ERROR org.apache.curator.ConnectionState - Connection timed out for connection string (10.10.100.201:5181) and timeout (5000) / elapsed (5013)
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
	at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:198) [curator-client-2.5.0.jar:na]
	at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:88) [curator-client-2.5.0.jar:na]
	at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115) [curator-client-2.5.0.jar:na]
	at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:807) [curator-framework-2.5.0.jar:na]
	at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:793) [curator-framework-2.5.0.jar:na]
	at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$400(CuratorFrameworkImpl.java:57) [curator-framework-2.5.0.jar:na]
	at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:275) [curator-framework-2.5.0.jar:na]
	at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_45]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45]
	at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
{code}

Here is the jstack thread dump for the sqlline process:

{code}
[root@centos-01 bin]# /usr/java/jdk1.7.0_45/bin/jstack 32136
2015-09-05 03:21:52
Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.45-b08 mixed mode):

"Attach Listener" daemon prio=10 tid=0x00007f8328003800 nid=0x27f1 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"CuratorFramework-0-EventThread" daemon prio=10 tid=0x00000000012fd800 nid=0x26e1 waiting on condition [0x00007f8317c2e000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000007e2117798> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
	at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:491)

"CuratorFramework-0-SendThread(centos-01.qa.lab:5181)" daemon prio=10 tid=0x0000000001109800 nid=0x26e0 waiting on condition [0x00007f8317b2d000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
	at java.lang.Thread.sleep(Native Method)
	at org.apache.zookeeper.client.StaticHostProvider.next(StaticHostProvider.java:86)
	at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:937)
	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:995)

"threadDeathWatcher-2-1" daemon prio=10 tid=0x00007f833043b800 nid=0x7e16 waiting on condition [0x00007f831751f000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
	at java.lang.Thread.sleep(Native Method)
	at io.netty.util.ThreadDeathWatcher$Watcher.run(ThreadDeathWatcher.java:137)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
	at java.lang.Thread.run(Thread.java:744)

"Client-1" daemon prio=10 tid=0x00007f8378df7000 nid=0x7e15 runnable [0x00007f8317620000]
   java.lang.Thread.State: RUNNABLE
	at io.netty.channel.epoll.Native.epollWait0(Native Method)
	at io.netty.channel.epoll.Native.epollWait(Native.java:148)
	at io.netty.channel.epoll.EpollEventLoop.epollWait(EpollEventLoop.java:180)
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:205)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
	at java.lang.Thread.run(Thread.java:744)

"ServiceCache-0" daemon prio=10 tid=0x00007f8378d22000 nid=0x7e13 waiting on condition [0x00007f831792b000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000006fff9c658> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
	at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)

"CuratorFramework-0" daemon prio=10 tid=0x00007f8378c95800 nid=0x7e12 waiting on condition [0x00007f8317a2c000]
   java.lang.Thread.State: TIMED_WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000006fff9ebd0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
	at java.util.concurrent.DelayQueue.take(DelayQueue.java:220)
	at java.util.concurrent.DelayQueue.take(DelayQueue.java:68)
	at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:781)
	at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$400(CuratorFrameworkImpl.java:57)
	at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:275)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)

"ConnectionStateManager-0" daemon prio=10 tid=0x00007f8378c60800 nid=0x7e0f waiting on condition [0x00007f8317d2f000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000006fffb2288> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
	at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:374)
	at org.apache.curator.framework.state.ConnectionStateManager.processEvents(ConnectionStateManager.java:208)
	at org.apache.curator.framework.state.ConnectionStateManager.access$000(ConnectionStateManager.java:42)
	at org.apache.curator.framework.state.ConnectionStateManager$1.call(ConnectionStateManager.java:110)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)

"NonBlockingInputStreamThread" daemon prio=10 tid=0x00007f8378836000 nid=0x7de0 in Object.wait() [0x00007f83186ab000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x00000006fffb2438> (a jline.internal.NonBlockingInputStream)
	at jline.internal.NonBlockingInputStream.run(NonBlockingInputStream.java:278)
	- locked <0x00000006fffb2438> (a jline.internal.NonBlockingInputStream)
	at java.lang.Thread.run(Thread.java:744)

"Service Thread" daemon prio=10 tid=0x00007f83780c1000 nid=0x7dcd runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" daemon prio=10 tid=0x00007f83780be800 nid=0x7dcc waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" daemon prio=10 tid=0x00007f83780bb800 nid=0x7dcb waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x00007f83780b1800 nid=0x7dca runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x00007f837809a800 nid=0x7dc9 in Object.wait() [0x00007f832c574000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x00000006fffb2668> (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
	- locked <0x00000006fffb2668> (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
	at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:189)

"Reference Handler" daemon prio=10 tid=0x00007f8378091000 nid=0x7dc8 in Object.wait() [0x00007f832c675000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x00000006fffb2700> (a java.lang.ref.Reference$Lock)
	at java.lang.Object.wait(Object.java:503)
	at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
	- locked <0x00000006fffb2700> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x00007f8378011000 nid=0x7db4 waiting on condition [0x00007f837cac2000]
   java.lang.Thread.State: TIMED_WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x0000000700d3a210> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
	at java.util.concurrent.LinkedBlockingDeque.pollFirst(LinkedBlockingDeque.java:519)
	at java.util.concurrent.LinkedBlockingDeque.poll(LinkedBlockingDeque.java:682)
	at org.apache.drill.jdbc.impl.DrillResultSetImpl$ResultsListener.getNext(DrillResultSetImpl.java:1536)
	at org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:175)
	at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:320)
	at net.hydromatic.avatica.AvaticaResultSet.next(AvaticaResultSet.java:187)
	at org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:161)
	at sqlline.IncrementalRows.hasNext(IncrementalRows.java:62)
	at sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
	at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
	at sqlline.SqlLine.print(SqlLine.java:1583)
	at sqlline.Commands.execute(Commands.java:852)
	at sqlline.Commands.sql(Commands.java:751)
	at sqlline.SqlLine.dispatch(SqlLine.java:738)
	at sqlline.SqlLine.begin(SqlLine.java:612)
	at sqlline.SqlLine.start(SqlLine.java:366)
	at sqlline.SqlLine.main(SqlLine.java:259)
{code}
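
For context on the "main" frames above: sqlline is parked in a timed LinkedBlockingDeque.poll called from ResultsListener.getNext, so the wait is presumably retried in a loop. The sketch below is not Drill's code and is not a proposed fix. It only illustrates, under stated assumptions, how a polling consumer can stay bounded by checking a failure flag and an overall deadline between polls. PollLoopSketch, batches, connectionLost and both timeout parameters are hypothetical names invented for this example.

```java
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative only: every name here is hypothetical and none is taken from Drill.
class PollLoopSketch {
    // Stand-in for the queue of result batches the client thread waits on.
    static final LinkedBlockingDeque<String> batches = new LinkedBlockingDeque<String>();
    // Stand-in for a flag that a connection listener would set on failure (assumption).
    static final AtomicBoolean connectionLost = new AtomicBoolean(false);

    // Polls in short slices. Between slices it checks the failure flag and an
    // overall deadline, so the caller is never parked indefinitely.
    static String getNext(long sliceMillis, long deadlineMillis) {
        final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(deadlineMillis);
        try {
            while (true) {
                if (connectionLost.get()) {
                    throw new IllegalStateException("connection lost while waiting for results");
                }
                String batch = batches.poll(sliceMillis, TimeUnit.MILLISECONDS);
                if (batch != null) {
                    return batch;
                }
                if (System.nanoTime() - deadline >= 0) {
                    throw new IllegalStateException("no results before deadline");
                }
            }
        } catch (InterruptedException e) {
            // Preserve the interrupt status for callers further up the stack.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for results", e);
        }
    }

    public static void main(String[] args) {
        batches.add("batch-1");
        System.out.println(getNext(50, 5000));

        // Simulate the coordinator going away while the client is still waiting.
        new Thread(new Runnable() {
            public void run() {
                try {
                    Thread.sleep(200);
                } catch (InterruptedException ignored) {
                    // Nothing to clean up in this demo thread.
                }
                connectionLost.set(true);
            }
        }).start();

        try {
            getNext(50, 5000);
        } catch (IllegalStateException e) {
            System.out.println("gave up: " + e.getMessage());
        }
    }
}
```

Without the flag check and the deadline, the same loop would keep polling an empty queue forever. That is consistent with the TIMED_WAITING state in the dump, though the dump alone does not establish the root cause.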



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
