hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14474) DeadLock in RpcClientImpl.Connection.close()
Date Tue, 13 Oct 2015 18:37:05 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955434#comment-14955434
] 

Andrew Purtell commented on HBASE-14474:
----------------------------------------

What is the current state of your tests [~enis] / [~ted_yu]?

I tried to follow the above and have concerns. If these changes didn't fix the tests or introduced
new problems I prefer we back everything out and try again once we have a complete solution.
Or, if tests are passing, let's resolve this. 


> DeadLock in RpcClientImpl.Connection.close() 
> ---------------------------------------------
>
>                 Key: HBASE-14474
>                 URL: https://issues.apache.org/jira/browse/HBASE-14474
>             Project: HBase
>          Issue Type: Bug
>          Components: rpc
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>            Priority: Blocker
>             Fix For: 2.0.0, 1.2.0, 1.0.3, 1.1.3
>
>         Attachments: 14474-addendum-2.txt, 14474-addendum.txt, hbase-14474_v1.patch,
hbase-14474_v2.patch, hbase-14474_v3.patch
>
>
> From a code base that contains 1.1.2 + HBASE-14449 + HBASE-14241 and HBASE-14313 we can
reproduce a dead lock with serverKilling CM easily: 
> {code}
> Found one Java-level deadlock:
> =============================
> "htable-pool1-t63":
>   waiting to lock monitor 0x0000000001cb1688 (object 0x00000000806ef150, a org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection),
>   which is held by "IPC Client (1403704789) connection to enis-hbase-sep-21-6.novalocal/172.22.107.106:16020
from root"
> "IPC Client (1403704789) connection to enis-hbase-sep-21-6.novalocal/172.22.107.106:16020
from root":
>   waiting to lock monitor 0x0000000001cb1738 (object 0x00000000806f0c60, a java.lang.Object),
>   which is held by "htable-pool1-t63"
> Java stack information for the threads listed above:
> ===================================================
> "htable-pool1-t63":
> 	at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.close(RpcClientImpl.java:819)
> 	- waiting to lock <0x00000000806ef150> (a org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection)
> 	at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:906)
> 	- locked <0x00000000806f0c60> (a java.lang.Object)
> 	at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:856)
> 	at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1192)
> 	at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)
> 	at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
> 	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.multi(ClientProtos.java:32699)
> 	at org.apache.hadoop.hbase.client.MultiServerCallable.call(MultiServerCallable.java:129)
> 	at org.apache.hadoop.hbase.client.MultiServerCallable.call(MultiServerCallable.java:54)
> 	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
> 	at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:708)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> "IPC Client (1403704789) connection to enis-hbase-sep-21-6.novalocal/172.22.107.106:16020
from root":
> 	at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.close(RpcClientImpl.java:832)
> 	- waiting to lock <0x00000000806f0c60> (a java.lang.Object)
> 	- locked <0x00000000806ef150> (a org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection)
> 	at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.run(RpcClientImpl.java:574)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message