hbase-issues mailing list archives

From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()
Date Tue, 08 Sep 2015 21:58:46 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735703#comment-14735703 ]

Enis Soztutar commented on HBASE-14370:
---------------------------------------

bq. w.r.t. thread leak, have you seen the following code?
Ok, missed that.
bq. Do you think tighter coordination is needed between the zk thread and the refresher thread?
In theory, there may be a race where one thread is refreshing the table auths while the other
is deleting that permission, since they will now be executing in different threads. Maybe
we can make every operation (nodeCreated, nodeDeleted, nodeDataChanged, nodeChildrenChanged)
execute from the executor.
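
A rough sketch of what I mean, not the actual patch: SerializedAclListener and its
methods are hypothetical names for illustration, not the ZKPermissionWatcher API. Each
callback only enqueues work on a single-thread executor, so the zk event thread returns
right away, and refreshes and deletes are applied one at a time in the order they arrived.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical stand-in for a ZooKeeper listener; not the real HBase class. */
class SerializedAclListener {
    // One worker thread: tasks run in submission order and never concurrently.
    private final ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "acl-refresher");
        t.setDaemon(true); // do not keep the JVM alive for this sketch
        return t;
    });

    // Only touched from the executor thread, so it needs no extra locking.
    private final List<String> applied = new ArrayList<>();

    // Every callback just enqueues its work and returns to the caller.
    void nodeCreated(String path)         { enqueue("created " + path); }
    void nodeDeleted(String path)         { enqueue("deleted " + path); }
    void nodeDataChanged(String path)     { enqueue("changed " + path); }
    void nodeChildrenChanged(String path) { enqueue("children " + path); }

    private void enqueue(String event) {
        // A real handler would refresh or remove permissions here.
        executor.execute(() -> applied.add(event));
    }

    /** Returns a copy of the applied events, taken on the executor thread itself. */
    List<String> snapshot() {
        Callable<List<String>> copy = () -> new ArrayList<>(applied);
        try {
            return executor.submit(copy).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    void close() {
        executor.shutdown();
    }
}
```

Since snapshot() is queued behind the earlier events, it only sees state after they have
all been applied. The cost is that a slow refresh still delays later acl events, but it no
longer blocks unrelated zk notifications.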

> Use separate thread for calling ZKPermissionWatcher#refreshNodes()
> ------------------------------------------------------------------
>
>                 Key: HBASE-14370
>                 URL: https://issues.apache.org/jira/browse/HBASE-14370
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>         Attachments: 14370-v1.txt, 14370-v3.txt
>
>
> I came off a support case (0.98.0) where main zk thread was seen doing the following:
> {code}
>   at org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.refreshAuthManager(ZKPermissionWatcher.java:152)
>   at org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.refreshNodes(ZKPermissionWatcher.java:135)
>   at org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.nodeChildrenChanged(ZKPermissionWatcher.java:121)
>   at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:348)
>   at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
>   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
> {code}
> There were 62000 nodes under /acl due to lack of fix from HBASE-12635, leading to slowness
> in table creation because zk notification for region offline was blocked by the above.
> The attached patch separates refreshNodes() call into its own thread.
> Thanks to Enis and Devaraj for offline discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
