Date: Thu, 22 Mar 2012 23:20:26 +0000 (UTC)
From: "Todd Lipcon (Updated) (JIRA)" <jira@apache.org>
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Updated] (HDFS-3071) haadmin failover command does not
 provide enough detail for when target NN is not ready to be active


     [ https://issues.apache.org/jira/browse/HDFS-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-3071:
------------------------------

    Attachment: hdfs-3071.txt

Manual testing turned up one more issue, which prompted me to go back and add automated tests for this feature. I fixed TestDFSHAAdminMiniCluster so that it actually records the error output, and added an assertion that the output is correct for the safe mode case. I also tested locally and ran all the HA tests in both Common and HDFS.
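For readers unfamiliar with this kind of test, here is a rough sketch of the general idea of recording a command's error output so a test can assert on it. It is not the code from the attached patch: the class, the method names, and the message format are all hypothetical and stand in for the real admin tool.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class FailoverErrorCapture {

    // Hypothetical stand-in for the admin command: reports failures on the
    // supplied error stream instead of writing to System.err directly.
    static int runFailover(PrintStream err, String target, String notReadyReason) {
        if (notReadyReason != null) {
            // Assumed message format; it appends the reason the target is not ready.
            err.println("Failover failed: " + target
                + " is not ready to become active: " + notReadyReason);
            return -1;
        }
        return 0;
    }

    // Runs the command against an in-memory stream and returns what was written.
    static String captureErrorOutput(String target, String notReadyReason) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        PrintStream err = new PrintStream(buf, true);
        runFailover(err, target, notReadyReason);
        err.flush();
        return buf.toString();
    }

    public static void main(String[] args) {
        String output = captureErrorOutput("xxx.yyy/1.2.3.4:8020", "Safe mode is ON.");
        // The check a test would make: the recorded output names the reason.
        if (!output.contains("Safe mode is ON.")) {
            throw new AssertionError("error output should name the reason: " + output);
        }
        System.out.print(output);
    }
}
```

Passing the stream in as a parameter is what makes the output recordable; a test that never captures the stream cannot notice a missing or wrong message.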
                
> haadmin failover command does not provide enough detail for when target NN is not ready to be active
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3071
>                 URL: https://issues.apache.org/jira/browse/HDFS-3071
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: ha
>    Affects Versions: 0.24.0
>            Reporter: Philip Zeyliger
>            Assignee: Todd Lipcon
>         Attachments: hdfs-3071.txt, hdfs-3071.txt, hdfs-3071.txt, hdfs-3071.txt, hdfs-3071.txt, hdfs-3071.txt
>
>
> When running the failover command, you can get an error message like the following:
> {quote}
> $ hdfs --config $(pwd) haadmin -failover namenode2 namenode1
> Failover failed: xxx.yyy/1.2.3.4:8020 is not ready to become active
> {quote}
> Unfortunately, the error message doesn't describe why that node isn't ready to become active. In my case, the target namenode's logs didn't indicate anything either. It turned out that the issue was "Safe mode is ON.Resources are low on NN. Safe mode must be turned off manually.", but ideally the user would be told that at the time of the failover.
