hive-issues mailing list archives

From "Yongzhi Chen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-15997) Resource leaks when query is cancelled
Date Mon, 06 Mar 2017 21:36:33 GMT

    [ https://issues.apache.org/jira/browse/HIVE-15997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15898157#comment-15898157
] 

Yongzhi Chen commented on HIVE-15997:
-------------------------------------

[~ctang.ma]
Removing Thread.currentThread().interrupt() from isInterrupted() just avoids setting the
interrupted flag too early and more than once. The tasks are still stopped by DriverContext's
shutdown() method, which calls thread.interrupt(). So TezTask is not affected by the change,
and queries that use TezTask can benefit from it.
The other changes in ExecDriver may speed up query cancellation when running with MR.
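
To illustrate the point about the flag, here is a minimal sketch, not the actual Hive code. The class
CancelCheckDemo and its methods are hypothetical; worker.interrupt() only stands in for the
thread.interrupt() call made on the shutdown path. The check reads the interrupted flag without
setting it again, and the worker still stops once the shutdown path interrupts it.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class CancelCheckDemo {
    // Hypothetical check: reads the flag without clearing it and without
    // calling Thread.currentThread().interrupt() again.
    static boolean isInterrupted() {
        return Thread.currentThread().isInterrupted();
    }

    // Starts a worker that polls the check, interrupts it once (as a shutdown
    // path would), and reports whether the worker saw the flag and exited.
    static boolean runUntilShutdown() {
        final AtomicBoolean observed = new AtomicBoolean(false);
        Thread worker = new Thread(() -> {
            long polls = 0;
            while (!isInterrupted()) {
                polls++; // stand-in for a unit of task work
            }
            observed.set(true);
        });
        worker.start();
        worker.interrupt(); // assumption: mirrors thread.interrupt() in shutdown()
        try {
            worker.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return observed.get() && !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + runUntilShutdown());
    }
}
```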

On catching InterruptedException in the ZooKeeper code:
When the InterruptedException is thrown, the thread has already been interrupted, so it should
not be interrupted again. In my tests, the second attempt succeeds. That means that when the
InterruptedException is thrown in this ZooKeeper case, the interrupted status is cleared.
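
The behaviour described above can be reproduced with plain JDK calls. This is a sketch under
assumptions: InterruptRetryDemo, blockingRelease() and releaseWithRetry() are hypothetical names,
and Thread.sleep() stands in for the blocking ZooKeeper delete. Like Object.wait(), it clears the
interrupted status when it throws InterruptedException.

```java
public class InterruptRetryDemo {
    // Hypothetical stand-in for a blocking release call. Thread.sleep throws
    // InterruptedException at once if the flag is set, and clears the flag.
    static void blockingRelease() throws InterruptedException {
        Thread.sleep(10);
    }

    // Returns the attempt number on which the release succeeded, or -1.
    static int releaseWithRetry(int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                blockingRelease();
                return attempt;
            } catch (InterruptedException e) {
                // The flag is already cleared here. Re-interrupting at this
                // point would make the next attempt fail immediately as well.
            }
        }
        return -1;
    }

    // Simulates a cancel arriving just before cleanup starts.
    static int demo() {
        Thread.currentThread().interrupt();
        return releaseWithRetry(3);
    }

    public static void main(String[] args) {
        System.out.println("released on attempt " + demo());
    }
}
```

If the caller still needs to know about the cancel after cleanup, one option is to set the flag
again only once the retries have finished; that is a general practice and not something the patch
is claimed to do.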





> Resource leaks when query is cancelled 
> ---------------------------------------
>
>                 Key: HIVE-15997
>                 URL: https://issues.apache.org/jira/browse/HIVE-15997
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Yongzhi Chen
>            Assignee: Yongzhi Chen
>         Attachments: HIVE-15997.1.patch
>
>
> There may be some resource leaks when a query is cancelled.
> We see the following stacks in the log:
> Possible files and folder leak:
> {noformat}
> 2017-02-02 06:23:25,410 WARN  hive.ql.Context: [HiveServer2-Background-Pool: Thread-61]:
Error Removing Scratch: java.io.IOException: Failed on local exception: java.nio.channels.ClosedByInterruptException;
Host Details : local host is: "ychencdh511t-1.vpc.cloudera.com/172.26.11.50"; destination
host is: "ychencdh511t-1.vpc.cloudera.com":8020; 
> 	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1476)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1409)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
> 	at com.sun.proxy.$Proxy25.delete(Unknown Source)
> 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:535)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
> 	at com.sun.proxy.$Proxy26.delete(Unknown Source)
> 	at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2059)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:675)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:671)
> 	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:671)
> 	at org.apache.hadoop.hive.ql.Context.removeScratchDir(Context.java:405)
> 	at org.apache.hadoop.hive.ql.Context.clear(Context.java:541)
> 	at org.apache.hadoop.hive.ql.Driver.releaseContext(Driver.java:2109)
> 	at org.apache.hadoop.hive.ql.Driver.closeInProcess(Driver.java:2150)
> 	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1472)
> 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212)
> 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207)
> 	at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
> 	at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
> 	at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796)
> 	at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.nio.channels.ClosedByInterruptException
> 	at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
> 	at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:681)
> 	at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
> 	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
> 	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
> 	at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:615)
> 	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:714)
> 	at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:376)
> 	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1525)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1448)
> 	... 35 more
> 2017-02-02 12:26:52,706 INFO  org.apache.hive.service.cli.operation.OperationManager:
[HiveServer2-Background-Pool: Thread-23]: Operation is timed out,operation=OperationHandle
[opType=EXECUTE_STATEMENT, getHandleIdentifier()=2af82100-94cf-4f26-abaa-c4b57c57b23c],state=CANCELED
> {noformat}
> Locks leak:
> {noformat}
> 2017-02-02 06:21:05,054 ERROR ZooKeeperHiveLockManager: [HiveServer2-Background-Pool:
Thread-61]: Failed to release ZooKeeper lock: 
> java.lang.InterruptedException
> 	at java.lang.Object.wait(Native Method)
> 	at java.lang.Object.wait(Object.java:503)
> 	at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1342)
> 	at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:871)
> 	at org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:238)
> 	at org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:233)
> 	at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
> 	at org.apache.curator.framework.imps.DeleteBuilderImpl.pathInForeground(DeleteBuilderImpl.java:230)
> 	at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:214)
> 	at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:41)
> 	at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.unlockPrimitive(ZooKeeperHiveLockManager.java:488)
> 	at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.unlockWithRetry(ZooKeeperHiveLockManager.java:466)
> 	at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.unlock(ZooKeeperHiveLockManager.java:454)
> 	at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.releaseLocks(ZooKeeperHiveLockManager.java:236)
> 	at org.apache.hadoop.hive.ql.Driver.releaseLocksAndCommitOrRollback(Driver.java:1175)
> 	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1432)
> 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212)
> 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207)
> 	at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
> 	at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
> 	at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796)
> 	at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
