hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Karthik Kambatla (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1222) Make improvements in ZKRMStateStore for fencing
Date Mon, 04 Nov 2013 07:49:18 GMT

    [ https://issues.apache.org/jira/browse/YARN-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812663#comment-13812663

Karthik Kambatla commented on YARN-1222:

bq. deleteWithRetries() - The new logic does not seem identical to the older logic.
Good catch! It is not identical, but should have the same effect. Instead of using a ZKAction
that does both exist() and delete(), the second patch was doing exist first (with retries)
and then delete which can lead to funny behavior. Hence, the return null.

bq. why do we do an exists check instead of catching the NoNodeException after calling delete
Agree deleting and catching NoNodeException makes more sense. The exists() followed by delete()
comes from earlier patches on YARN-353, and I am not sure the reasoning behind it. We don't
seem to be using the watch anywhere. The updated patch (-3.patch) adopts this approach.

bq. Should this method only change the ACL's pertaining to other RM instances and not every
ACL on the znode? Alternatively, if the assumption is that the root znode etc are manipulated
only by the RM's then can we simply remove all older ACL's and set a new ACL for the current
Only the RMs manipulate the znode structure. Here, we are using the ZKStoreACLs to generate
the ZKRootNodeACLs. IIUC, the ZKStoreACLs define who has access to the store. That could be
used to define the users that can run the RM with ZKStateStore; an example could be that only
adminuser is allowed. The patch provides adminuser as much access on the ZKRootNode as possible,
removing create-delete permissions. In addition to that, it gives exclusive create-delete
perms to the RM on ZKRootNode. I think this is required if we want to support this behavior
even for an ACL list where say each element gives one user access to run the RM. One extreme
case could be two RMs running - each started by a different user with both users being allowed
by the ACLs. 

Does that sound reasonable? Should we handle it any other way?

> Make improvements in ZKRMStateStore for fencing
> -----------------------------------------------
>                 Key: YARN-1222
>                 URL: https://issues.apache.org/jira/browse/YARN-1222
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Karthik Kambatla
>         Attachments: yarn-1222-1.patch, yarn-1222-2.patch, yarn-1222-3.patch
> Using multi-operations for every ZK interaction. 
> In every operation, automatically creating/deleting a lock znode that is the child of
the root znode. This is to achieve fencing by modifying the create/delete permissions on the
root znode.

This message was sent by Atlassian JIRA

View raw message