Date: Tue, 5 Nov 2013 18:23:17 +0000 (UTC)
From: "Bikas Saha (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-1222) Make improvements in ZKRMStateStore for fencing

[ https://issues.apache.org/jira/browse/YARN-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814092#comment-13814092 ]

Bikas Saha commented on YARN-1222:
----------------------------------

bq. The root-node ACLs are per RM instance. They need to be different for it to work. The documentation in yarn-default.xml explains this - we might have to make it even more clear?

Clarifying it, possibly with an example would be good.

bq. The number of ACLs in the list is always bounded by (user-configured-for-store + 1). Am I missing something?

I missed that the patch is modifying the base ACL from the config and not the actual ACL from the znode. The latter would have increased the count. The former is fine. The current code is good.

Where is the shared rm-admin-acl being set such that both RMs have admin access to the root znode? This probably works because the default is world:all. But if that is not the case, and we are using internally generated ACLs, then the RM has to give shared admin access to the other RM when it creates the root znode, right?

bq. Do you think we should make it aware of fencing - have something like a StoreFencedException?

I think it should be aware of when the store is not available to it because it has been fenced out. There are/were comments in the state store error handling about differentiating between exceptions once we have such a differentiation. So we should create a Fenced exception (look at HDFS code for an example). This way all state stores can report this incident for identical handling in the upper layers. We would like to avoid state store impls (which are technically runtime pluggable pieces) having to understand internal Hadoop code patterns for HA etc.

bq. ZKRMStateStore itself is @Private @Unstable. Should we still label the methods @Private?

At some point ZKRMStateStore will become public/stable but these methods should remain private for testing, right?

> Make improvements in ZKRMStateStore for fencing
> -----------------------------------------------
>
>                 Key: YARN-1222
>                 URL: https://issues.apache.org/jira/browse/YARN-1222
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Karthik Kambatla
>         Attachments: yarn-1222-1.patch, yarn-1222-2.patch, yarn-1222-3.patch, yarn-1222-4.patch
>
>
> Using multi-operations for every ZK interaction.
> In every operation, automatically creating/deleting a lock znode that is the child of the root znode. This is to achieve fencing by modifying the create/delete permissions on the root znode.
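
The lock-znode scheme in the issue description, together with the fenced exception discussed in the comment, can be modeled roughly as follows. This is only an in-memory sketch under stated assumptions, not the code in the attached patches: it makes no ZooKeeper calls, and the names FencingSketch, RootZnodeModel, claim, write and read are hypothetical. StoreFencedException is borrowed from the name floated in the comment; its shape here is assumed.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative model only: no real ZooKeeper client calls are made here.
final class FencingSketch {

    // Hypothetical exception type, named after the StoreFencedException floated in the comment.
    static class StoreFencedException extends Exception {
        StoreFencedException(String message) {
            super(message);
        }
    }

    // In-memory stand-in for the root znode and the permissions on it.
    static class RootZnodeModel {
        private final Set<String> admins = new HashSet<>();        // RMs with shared admin access
        private final Map<String, String> state = new HashMap<>(); // stored application state
        private String createDeleteHolder;                          // RM allowed to create/delete children

        RootZnodeModel(String... adminRmIds) {
            for (String id : adminRmIds) {
                admins.add(id);
            }
        }

        // An RM that becomes active uses its admin access to take exclusive
        // create/delete permission on the root, which fences the other RM.
        void claim(String rmId) {
            if (!admins.contains(rmId)) {
                throw new IllegalStateException(rmId + " has no admin access to the root znode");
            }
            createDeleteHolder = rmId;
        }

        // Models one multi-operation: create the lock child, apply the write,
        // delete the lock child. A fenced RM fails at the lock step, so its
        // write is never applied.
        void write(String rmId, String key, String value) throws StoreFencedException {
            if (!rmId.equals(createDeleteHolder)) {
                throw new StoreFencedException(rmId + " was fenced: cannot create the lock znode");
            }
            state.put(key, value);
        }

        String read(String key) {
            return state.get(key);
        }
    }

    public static void main(String[] args) throws Exception {
        RootZnodeModel root = new RootZnodeModel("rm1", "rm2");
        root.claim("rm1");
        root.write("rm1", "app_1", "RUNNING");

        root.claim("rm2"); // failover: rm2 takes over and fences rm1
        try {
            root.write("rm1", "app_1", "KILLED");
        } catch (StoreFencedException e) {
            System.out.println("Upper layer sees: " + e.getMessage());
        }
        System.out.println("app_1 = " + root.read("app_1"));
    }
}
```

The point of the dedicated exception type in this sketch is that the caller can tell "fenced out" apart from other store failures without knowing anything about ZooKeeper, which is the separation the comment asks for. Both RMs need admin access for claim to work, which is why the shared rm-admin-acl question matters.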