hbase-issues mailing list archives

From "Ted Yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
Date Fri, 11 Sep 2015 11:17:45 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14740594#comment-14740594 ]

Ted Yu commented on HBASE-14370:
--------------------------------

TestAccessController3 seems to hang in branch-1.
Here is part of the stack trace:
{code}
"main" prio=5 tid=0x00007ff522800000 nid=0x1903 in Object.wait() [0x00000001098f7000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
  at java.lang.Object.wait(Native Method)
  - waiting on <0x00000007c6a63358> (a java.util.concurrent.atomic.AtomicBoolean)
  at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:168)
  - locked <0x00000007c6a63358> (a java.util.concurrent.atomic.AtomicBoolean)
  at org.apache.hadoop.hbase.ipc.RegionCoprocessorRpcChannel.callExecService(RegionCoprocessorRpcChannel.java:95)
  at org.apache.hadoop.hbase.ipc.CoprocessorRpcChannel.callBlockingMethod(CoprocessorRpcChannel.java:73)
  at org.apache.hadoop.hbase.protobuf.generated.AccessControlProtos$AccessControlService$BlockingStub.grant(AccessControlProtos.java:10280)
  at org.apache.hadoop.hbase.protobuf.ProtobufUtil.grant(ProtobufUtil.java:2181)
  at org.apache.hadoop.hbase.security.access.SecureTestUtil$2.call(SecureTestUtil.java:375)
  at org.apache.hadoop.hbase.security.access.SecureTestUtil$2.call(SecureTestUtil.java:367)
  at org.apache.hadoop.hbase.security.access.SecureTestUtil.updateACLs(SecureTestUtil.java:332)
  at org.apache.hadoop.hbase.security.access.SecureTestUtil.grantGlobal(SecureTestUtil.java:367)
  at org.apache.hadoop.hbase.security.access.TestAccessController3.setUpTableAndUserPermissions(TestAccessController3.java:243)
  at org.apache.hadoop.hbase.security.access.TestAccessController3.setupBeforeClass(TestAccessController3.java:187)
{code}

> Use separate thread for calling ZKPermissionWatcher#refreshNodes()
> ------------------------------------------------------------------
>
>                 Key: HBASE-14370
>                 URL: https://issues.apache.org/jira/browse/HBASE-14370
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>         Attachments: 14370-branch-1-v10.txt, 14370-v1.txt, 14370-v10.txt, 14370-v3.txt, 14370-v5.txt, 14370-v7.txt, 14370-v8.txt, 14370-wait-nofity-v2.txt, 14370-wait-nofity.txt, hbase-14370_v4.patch
>
>
> I came across a support case (0.98.0) where the main zk event thread was seen doing the following:
> {code}
>   at org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.refreshAuthManager(ZKPermissionWatcher.java:152)
>   at org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.refreshNodes(ZKPermissionWatcher.java:135)
>   at org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.nodeChildrenChanged(ZKPermissionWatcher.java:121)
>   at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:348)
>   at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
>   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
> {code}
> There were 62000 nodes under /acl because the fix from HBASE-12635 was missing. This slowed down table creation, because the zk notification for region offline was blocked behind the call above.
> The attached patch moves the refreshNodes() call into its own thread.
> Thanks to Enis and Devaraj for offline discussion.
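
The idea in the description above can be sketched as follows. This is a minimal illustration, not the attached patch: the class PermissionWatcherSketch, its refreshed list, and the thread name are hypothetical stand-ins, and no HBase or ZooKeeper classes are used. The callback that would run on the zk event thread only queues the work, so later notifications are not blocked while a large /acl subtree is processed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class AsyncRefreshSketch {

    /** Hypothetical stand-in for a zk listener; this is not the HBase class. */
    static class PermissionWatcherSketch {
        // A single worker thread, so refreshes run in the order they were queued.
        private final ExecutorService refresher =
            Executors.newSingleThreadExecutor(r -> {
                Thread t = new Thread(r, "acl-refresh-sketch");
                t.setDaemon(true);
                return t;
            });

        // Records which nodes have been refreshed (illustration only).
        final List<String> refreshed = new CopyOnWriteArrayList<>();

        // Would be invoked on the event thread: queue the work and return at once.
        void nodeChildrenChanged(List<String> nodes) {
            refresher.submit(() -> refreshNodes(nodes));
        }

        // The potentially slow part, now running on the worker thread.
        private void refreshNodes(List<String> nodes) {
            refreshed.addAll(nodes);
        }

        // Stop accepting work and wait for already queued refreshes to finish.
        boolean close() {
            refresher.shutdown();
            try {
                return refresher.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        PermissionWatcherSketch watcher = new PermissionWatcherSketch();
        watcher.nodeChildrenChanged(Arrays.asList("/acl/t1", "/acl/t2"));
        watcher.close();
        System.out.println(watcher.refreshed);
    }
}
```

A single-thread executor is chosen here so that updates are applied in arrival order; whether the real patch keeps that ordering, or coalesces pending refreshes, is not stated in this message.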



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
