hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-13851) RpcClientImpl.close() can hang with cancelled replica RPCs
Date Sat, 06 Jun 2015 02:30:00 GMT

     [ https://issues.apache.org/jira/browse/HBASE-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Enis Soztutar updated HBASE-13851:
----------------------------------
    Attachment: hbase-13851_v1.patch

Attaching a patch which fixes the hang issue in my tests. Ran the IT 20 times without hanging.


The patch interrupts the CallSender thread as well, and in case Connection is not started,
it calls close() which will remove it from the connections list.  

> RpcClientImpl.close() can hang with cancelled replica RPCs
> ----------------------------------------------------------
>
>                 Key: HBASE-13851
>                 URL: https://issues.apache.org/jira/browse/HBASE-13851
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 2.0.0, 1.2.0, 1.1.1
>
>         Attachments: hbase-13851_v1.patch
>
>
> We have seen the clients hanging in running the test {{IntegrationTestRegionReplicaPerf}}
in 1.1 code base during the test.The jstack gives: 
> {code}
> "IPC Client (1344340481) connection to os-enis-dal-test-jun-4-1.openstacklocal/172.22.80.25:16020
from root - writer" daemon prio=10 tid=0x00007f3891b29800 nid=0x7345 waiting on condition
[0x00007f3865647000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x000000070d54a240> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>         at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:374)
>         at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$CallSender.run(RpcClientImpl.java:253)
> "TestClient-3" prio=10 tid=0x00007f3892660800 nid=0x63b0 waiting on condition [0x00007f386ecdd000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.hadoop.hbase.ipc.RpcClientImpl.close(RpcClientImpl.java:1139)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.internalClose(ConnectionManager.java:2371)
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.close(ConnectionManager.java:2384)
>         at org.apache.hadoop.hbase.PerformanceEvaluation$Test.testTakedown(PerformanceEvaluation.java:1036)
>         at org.apache.hadoop.hbase.PerformanceEvaluation$RandomReadTest.testTakedown(PerformanceEvaluation.java:1351)
>         at org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:1055)
>         at org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1612)
>         at org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:410)
>         at org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:405)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message