hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HBASE-14284) In TRUNK, AsyncRpcClient does not timeout; hangs TestDistributedLogReplay, etc.
Date Sat, 22 Aug 2015 00:56:47 GMT

     [ https://issues.apache.org/jira/browse/HBASE-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-14284.
---------------------------
    Resolution: Invalid

I had this wrong. AsyncRpcClient HAS timeouts; it only looked like it didn't, because my added logging misled me. Resolving as invalid. The actual issue is blocked handlers. Will open a new issue for that.

> In TRUNK, AsyncRpcClient does not timeout; hangs TestDistributedLogReplay, etc.
> -------------------------------------------------------------------------------
>
>                 Key: HBASE-14284
>                 URL: https://issues.apache.org/jira/browse/HBASE-14284
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>
> TestDistributedLogReplay puts up regionservers with *40* priority handlers each. This
> makes for TDLR running with many hundreds of threads. Trying to figure out why 40, I see
> the test can hang with fewer, with all client calls stuck and never timing out:
> {code}
> "RS:2;localhost:58498" prio=5 tid=0x00007fd284d4e800 nid=0x416af in Object.wait() [0x000000012952e000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
> 	at java.lang.Object.wait(Native Method)
> 	at java.lang.Object.wait(Object.java:461)
> 	at io.netty.util.concurrent.DefaultPromise.await0(DefaultPromise.java:355)
> 	- locked <0x00000007dff93ea0> (a org.apache.hadoop.hbase.ipc.AsyncCall)
> 	at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:266)
> 	at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:42)
> 	at org.apache.hadoop.hbase.ipc.AsyncRpcClient.call(AsyncRpcClient.java:231)
> 	at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:214)
> 	at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:288)
> 	at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.regionServerReport(RegionServerStatusProtos.java:8994)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:1148)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:957)
> 	at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.runRegionServer(MiniHBaseCluster.java:156)
> 	at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.access$000(MiniHBaseCluster.java:108)
> 	at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer$1.run(MiniHBaseCluster.java:140)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:356)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
> 	at org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:279)
> 	at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.run(MiniHBaseCluster.java:138)
> 	at java.lang.Thread.run(Thread.java:744)
> {code}
> We never recover.
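For context on the reported hang: the stack trace shows the regionserver thread parked in an unbounded `DefaultPromise.await()` under `AbstractFuture.get()`, so if the response never arrives the caller blocks forever. This is not the actual AsyncRpcClient code, just a stdlib sketch of the difference between an unbounded and a deadline-bounded wait (class and variable names here are hypothetical):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class AwaitSketch {
    public static void main(String[] args) throws Exception {
        // A response that never arrives, like the stuck regionServerReport call.
        CompletableFuture<String> pending = new CompletableFuture<>();

        // Unbounded wait: pending.get() would block this thread forever,
        // which is the shape of the hang in the stack trace above.

        // Bounded wait: a deadline turns the hang into a recoverable failure.
        boolean timedOut = false;
        try {
            pending.get(100, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            timedOut = true; // caller can now fail the RPC cleanly or retry
        }
        System.out.println("timedOut=" + timedOut);
    }
}
```

As the resolution above notes, the real client does apply timeouts to these waits; the apparent unbounded hang had a different cause (blocked handlers).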



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
