hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
Date Tue, 08 Sep 2015 19:06:46 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735423#comment-14735423
] 

Enis Soztutar commented on HBASE-14370:
---------------------------------------

Forking a thread is expensive, so we do not use that pattern but instead use Executors with
fixed thread pools. Plus, as mentioned above, we have to make sure that the refresh requests
from different even notifications should execute in the same order that the zk notification
comes from. Single threaded executor should be able to do that. 

> Use separate thread for calling ZKPermissionWatcher#refreshNodes()
> ------------------------------------------------------------------
>
>                 Key: HBASE-14370
>                 URL: https://issues.apache.org/jira/browse/HBASE-14370
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>         Attachments: 14370-v1.txt
>
>
> I came off a support case (0.98.0) where main zk thread was seen doing the following:
> {code}
>   at org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.refreshAuthManager(ZKPermissionWatcher.java:152)
>   at org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.refreshNodes(ZKPermissionWatcher.java:135)
>   at org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.nodeChildrenChanged(ZKPermissionWatcher.java:121)
>   at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:348)
>   at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
>   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
> {code}
> There were 62000 nodes under /acl due to lack of fix from HBASE-12635, leading to slowness
in table creation because zk notification for region offline was blocked by the above.
> The attached patch separates refreshNodes() call into its own thread.
> Thanks to Enis and Devaraj for offline discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message