hive-issues mailing list archives

From "Yongzhi Chen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-15997) Resource leaks when query is cancelled
Date Mon, 01 May 2017 23:13:04 GMT

    [ https://issues.apache.org/jira/browse/HIVE-15997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15991745#comment-15991745 ]

Yongzhi Chen commented on HIVE-15997:
-------------------------------------

657      if (lDrvState.driverState == DriverState.INTERRUPT) {
658        Thread.currentThread().interrupt();    // removed
659        return true;

Removing Thread.currentThread().interrupt(); can solve some of the resource leaks (depending on the cancel time): it lets the query close gracefully instead of being interrupted during file cleanup (see the sketch below).
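To illustrate why the interrupt status matters, here is a standalone sketch (not Hive code): once a thread's interrupt status is set, the next interruptible NIO operation it runs, such as the RPC connect that HDFS uses when deleting the scratch directory, fails with ClosedByInterruptException, which is the same "Caused by" shown in the stack trace below. The host name and port here are placeholders.
{noformat}
import java.net.InetSocketAddress;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.SocketChannel;

// Standalone sketch: a thread whose interrupt status is already set gets
// ClosedByInterruptException from the next interruptible NIO operation it runs,
// which is why the interrupted background thread cannot delete the scratch dir.
public class InterruptedCleanupDemo {
  public static void main(String[] args) {
    Thread.currentThread().interrupt();                  // what the removed Driver line did
    try (SocketChannel ch = SocketChannel.open()) {
      // Any interruptible I/O stands in for the HDFS RPC connect; host/port are placeholders.
      ch.connect(new InetSocketAddress("example.org", 8020));
    } catch (ClosedByInterruptException e) {
      System.out.println("cleanup-style I/O failed: " + e);
    } catch (Exception e) {
      System.out.println("unexpected: " + e);
    }
  }
}
{noformat}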
And the fixes in the ZooKeeper lock-release code can fix the lock leaks in my test case (a rough sketch of one interrupt-safe release follows).
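For the lock side, the second stack trace below shows the synchronous ZooKeeper delete dying in Object.wait() with InterruptedException, so the lock znode is never removed. As a rough illustration only, not the actual HIVE-15997 patch (the class and method names here are made up), one way to make the release robust is to clear the interrupt status around the Curator delete and restore it afterwards:
{noformat}
import org.apache.curator.framework.CuratorFramework;

// Rough sketch, not the HIVE-15997 patch: release a ZooKeeper lock node even when the
// calling thread already carries the interrupt status from a cancelled query.
final class InterruptSafeUnlock {                       // hypothetical helper class
  static void deleteLockNode(CuratorFramework client, String zLockPath) throws Exception {
    boolean wasInterrupted = Thread.interrupted();      // clear and remember the flag
    try {
      client.delete().forPath(zLockPath);               // the call failing in the stack below
    } finally {
      if (wasInterrupted) {
        Thread.currentThread().interrupt();             // hand the interrupt back to the caller
      }
    }
  }
}
{noformat}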



> Resource leaks when query is cancelled 
> ---------------------------------------
>
>                 Key: HIVE-15997
>                 URL: https://issues.apache.org/jira/browse/HIVE-15997
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Yongzhi Chen
>            Assignee: Yongzhi Chen
>             Fix For: 2.2.0
>
>         Attachments: HIVE-15997.1.patch
>
>
> There may be some resource leaks when a query is cancelled.
> We see the following stacks in the log:
> Possible file and folder leak:
> {noformat} 
> 2017-02-02 06:23:25,410 WARN hive.ql.Context: [HiveServer2-Background-Pool: Thread-61]: Error Removing Scratch: java.io.IOException: Failed on local exception: java.nio.channels.ClosedByInterruptException; Host Details : local host is: "ychencdh511t-1.vpc.cloudera.com/172.26.11.50"; destination host is: "ychencdh511t-1.vpc.cloudera.com":8020;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
> at org.apache.hadoop.ipc.Client.call(Client.java:1476)
> at org.apache.hadoop.ipc.Client.call(Client.java:1409)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
> at com.sun.proxy.$Proxy25.delete(Unknown Source)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:535)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
> at com.sun.proxy.$Proxy26.delete(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2059)
> at org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:675)
> at org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:671)
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:671)
> at org.apache.hadoop.hive.ql.Context.removeScratchDir(Context.java:405)
> at org.apache.hadoop.hive.ql.Context.clear(Context.java:541)
> at org.apache.hadoop.hive.ql.Driver.releaseContext(Driver.java:2109)
> at org.apache.hadoop.hive.ql.Driver.closeInProcess(Driver.java:2150)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1472)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207)
> at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
> at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
> at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796)
> at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.nio.channels.ClosedByInterruptException
> at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
> at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:681)
> at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
> at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:615)
> at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:714)
> at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:376)
> at org.apache.hadoop.ipc.Client.getConnection(Client.java:1525)
> at org.apache.hadoop.ipc.Client.call(Client.java:1448)
> ... 35 more
> 2017-02-02 12:26:52,706 INFO org.apache.hive.service.cli.operation.OperationManager: [HiveServer2-Background-Pool: Thread-23]: Operation is timed out,operation=OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=2af82100-94cf-4f26-abaa-c4b57c57b23c],state=CANCELED
> {noformat}
> Possible lock leak:
> {noformat}
> 2017-02-02 06:21:05,054 ERROR ZooKeeperHiveLockManager: [HiveServer2-Background-Pool: Thread-61]: Failed to release ZooKeeper lock:
> java.lang.InterruptedException
> 	at java.lang.Object.wait(Native Method)
> 	at java.lang.Object.wait(Object.java:503)
> 	at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1342)
> 	at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:871)
> 	at org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:238)
> 	at org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:233)
> 	at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
> 	at org.apache.curator.framework.imps.DeleteBuilderImpl.pathInForeground(DeleteBuilderImpl.java:230)
> 	at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:214)
> 	at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:41)
> 	at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.unlockPrimitive(ZooKeeperHiveLockManager.java:488)
> 	at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.unlockWithRetry(ZooKeeperHiveLockManager.java:466)
> 	at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.unlock(ZooKeeperHiveLockManager.java:454)
> 	at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.releaseLocks(ZooKeeperHiveLockManager.java:236)
> 	at org.apache.hadoop.hive.ql.Driver.releaseLocksAndCommitOrRollback(Driver.java:1175)
> 	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1432)
> 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212)
> 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207)
> 	at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
> 	at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
> 	at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796)
> 	at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
