hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Gary Helmling (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6851) Race condition in TableAuthManager.updateGlobalCache()
Date Fri, 21 Sep 2012 23:50:07 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460948#comment-13460948

Gary Helmling commented on HBASE-6851:

bq. Cache is unbounded. Any fear of it getting large? (I suppose the old code worked this
way so this patch doesn't introduce that issue).

Making the cache bounded/not all memory resident would be a pretty significant change and
have potentially big (negative) impacts on performance.  There may be some cases of many tables
+ many users where this may start to become a problem.  But for the expected usage (maybe
hundreds of tables, hundreds of users), I wouldn't expect this to be much of an issue.

bq. Are all accesses on GLOBAL_CACHE synchronized? Do they need to be? Could we get a NPE
again when we swap it in and are accessing it elsewhere concurrently?

No, accesses are not synchronized (and I don't think we want them to be for performance reasons).
 Since we're rebuilding and re-assigning the variable reference, I don't think we're exposed
to an NPE here.

bq. Does updateTableCache need synchronize?

The table cache is a ConcurrentSkipListMap so we should be able to reassign variables atomically.
 But maybe we do want to serialize calls to this as well?  Seems like both updateGlobalCache()
and updateTableCache() need to be synchronized, or neither does.  Since these are only called
from the associated ZKPermissionWatcher/ZooKeeperWatcher, is there any need to even synchronize
these to force serialization of the calls?  Does the ZK ordering of the watch events give
us this already?  If I understand correctly, maybe we don't need the synchronization at all.

Should we make the GLOBAL_CACHE reference volatile, though, to ensure all threads see it when
> Race condition in TableAuthManager.updateGlobalCache()
> ------------------------------------------------------
>                 Key: HBASE-6851
>                 URL: https://issues.apache.org/jira/browse/HBASE-6851
>             Project: HBase
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 0.94.1, 0.96.0
>            Reporter: Gary Helmling
>            Assignee: Gary Helmling
>            Priority: Critical
>         Attachments: HBASE-6851.patch
> When new global permissions are assigned, there is a race condition, during which further
authorization checks relying on global permissions may fail.
> In TableAuthManager.updateGlobalCache(), we have:
> {code:java}
>     USER_CACHE.clear();
>     GROUP_CACHE.clear();
>     try {
>       initGlobal(conf);
>     } catch (IOException e) {
>       // Never happens
>       LOG.error("Error occured while updating the user cache", e);
>     }
>     for (Map.Entry<String,TablePermission> entry : userPerms.entries()) {
>       if (AccessControlLists.isGroupPrincipal(entry.getKey())) {
>         GROUP_CACHE.put(AccessControlLists.getGroupName(entry.getKey()),
>                         new Permission(entry.getValue().getActions()));
>       } else {
>         USER_CACHE.put(entry.getKey(), new Permission(entry.getValue().getActions()));
>       }
>     }
> {code}
> If authorization checks come in following the .clear() but before repopulating, they
will fail.
> We should have some synchronization here to serialize multiple updates and use a COW
type rebuild and reassign of the new maps.
> This particular issue crept in with the fix in HBASE-6157, so I'm flagging for 0.94 and

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

View raw message