From: "Todd Lipcon (Updated) (JIRA)"
To: common-issues@hadoop.apache.org
Date: Wed, 11 Apr 2012 00:07:20 +0000 (UTC)
Message-ID: <1679970621.10238.1334102840667.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <2024739814.15210.1333587322121.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (HADOOP-8247) Auto-HA: add a config to enable auto-HA, which disables manual FC

[
https://issues.apache.org/jira/browse/HADOOP-8247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HADOOP-8247:
--------------------------------

    Attachment: hadoop-8247.txt

Attached patch implements the above. Here is a transcript of its usage:

$ ./bin/hdfs haadmin -transitionToStandby nn1 -forcemanual
You have specified the forcemanual flag. This flag is dangerous, as it can
induce a split-brain scenario that WILL CORRUPT your HDFS namespace, possibly
irrecoverably. It is recommended not to use this flag, but instead to shut
down the cluster and disable automatic failover if you prefer to manually
manage your HA state. You may abort safely by answering 'n' or hitting ^C now.
Are you sure you want to continue? (Y or N) n
12/04/10 17:05:53 FATAL ha.HAAdmin: Aborted

todd@todd-w510:~/git/hadoop-common/hadoop-dist/target/hadoop-3.0.0-SNAPSHOT$ ./bin/hdfs haadmin -transitionToStandby nn1 -forcemanual
You have specified the forcemanual flag. This flag is dangerous, as it can
induce a split-brain scenario that WILL CORRUPT your HDFS namespace, possibly
irrecoverably. It is recommended not to use this flag, but instead to shut
down the cluster and disable automatic failover if you prefer to manually
manage your HA state. You may abort safely by answering 'n' or hitting ^C now.
Are you sure you want to continue? (Y or N) y
12/04/10 17:02:05 WARN ha.HAAdmin: Proceeding with manual HA state management even though automatic failover is enabled for NameNode at todd-w510/127.0.0.1:8021

> Auto-HA: add a config to enable auto-HA, which disables manual FC
> -----------------------------------------------------------------
>
>                 Key: HADOOP-8247
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8247
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: auto-failover, ha
>    Affects Versions: Auto Failover (HDFS-3042)
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hadoop-8247.txt, hadoop-8247.txt, hadoop-8247.txt, hadoop-8247.txt, hadoop-8247.txt
>
>
> Currently, if automatic failover is set up and running, and the user uses the "haadmin -failover" command, he or she can end up putting the system in an inconsistent state, where the state in ZK disagrees with the actual state of the world.
> To fix this, we should add a config flag which is used to enable auto-HA. When this flag is set, we should disallow use of the haadmin command to initiate failovers. We should refuse to run ZKFCs when the flag is not set.
> Of course, this flag should be scoped by nameservice.
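
For readers who want the gating rules from the issue description and the transcript in one place, here is a minimal sketch in plain Java. It is not the code in hadoop-8247.txt. The class name, the enum, and the config key prefix "example.ha.automatic-failover.enabled" are all hypothetical, since this message does not name the real key, and the fallback from the nameservice-scoped key to an unscoped key is an assumption about what "scoped by nameservice" could mean.

```java
import java.util.Map;

/** Illustrative sketch only; every name here is hypothetical, not Hadoop API. */
final class ManualFailoverGate {

    // Placeholder key prefix. The message does not state the real config key.
    static final String AUTO_HA_KEY = "example.ha.automatic-failover.enabled";

    enum Decision { ALLOW, REFUSE, ABORTED, ALLOW_FORCED }

    /**
     * Reads the auto-HA flag for one nameservice. Assumption: a key suffixed
     * with the nameservice ID overrides the unscoped key; missing means false.
     */
    static boolean autoFailoverEnabled(Map<String, String> conf, String nameservice) {
        String scoped = conf.get(AUTO_HA_KEY + "." + nameservice);
        String value = (scoped != null) ? scoped : conf.get(AUTO_HA_KEY);
        return Boolean.parseBoolean(value); // parseBoolean(null) is false
    }

    /**
     * Decides whether a manual state transition may proceed.
     * - auto-HA off: manual management is the normal mode, so allow.
     * - auto-HA on, no force flag: refuse.
     * - auto-HA on, force flag given: proceed only if the operator confirmed.
     */
    static Decision checkManualTransition(boolean autoEnabled,
                                          boolean forceManual,
                                          boolean operatorConfirmed) {
        if (!autoEnabled) {
            return Decision.ALLOW;
        }
        if (!forceManual) {
            return Decision.REFUSE;
        }
        return operatorConfirmed ? Decision.ALLOW_FORCED : Decision.ABORTED;
    }

    /** The failover controller side: refuse to run unless the flag is set. */
    static boolean mayStartFailoverController(boolean autoEnabled) {
        return autoEnabled;
    }
}
```

In terms of the transcript, answering 'n' corresponds to ABORTED and answering 'y' to ALLOW_FORCED, which is the case where the WARN line is logged.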