Date: Thu, 6 Dec 2012 02:00:58 +0000 (UTC)
From: "Harsh J (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Message-ID: <1461108814.66413.1354759258825.JavaMail.jiratomcat@arcas>
In-Reply-To: <967526869.66171.1354755298699.JavaMail.jiratomcat@arcas>
Subject: [jira] [Commented] (HDFS-4281) NameNode recovery does not detect NN RPC address on HA cluster

    [ https://issues.apache.org/jira/browse/HDFS-4281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511023#comment-13511023 ]

Harsh J commented on HDFS-4281:
-------------------------------

Hi,

Is this a duplicate of Colin's HDFS-4279?

> NameNode recovery does not detect NN RPC address on HA cluster
> --------------------------------------------------------------
>
>                 Key: HDFS-4281
>                 URL: https://issues.apache.org/jira/browse/HDFS-4281
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.0.0-alpha
>            Reporter: Stephen Chu
>         Attachments: core-site.xml, hdfs-site.xml, nn_recover
>
>
> On a shut down HA cluster, I ran "hdfs namenode -recover" and encountered:
> {code}
> You have selected Metadata Recovery mode. This mode is intended to recover lost metadata on a corrupt filesystem. Metadata recovery mode often permanently deletes data from your HDFS filesystem. Please back up\
> your edit log and fsimage before trying this!
> Are you ready to proceed? (Y/N)
> (Y or N) Y
> 12/12/05 16:43:48 INFO namenode.MetaRecoveryContext: starting recovery...
> 12/12/05 16:43:48 WARN common.Util: Path /dfs/nn should be specified as a URI in configuration files. Please update hdfs configuration.
> 12/12/05 16:43:48 WARN common.Util: Path /dfs/nn should be specified as a URI in configuration files. Please update hdfs configuration.
> 12/12/05 16:43:48 WARN namenode.FSNamesystem: Only one image storage directory (dfs.namenode.name.dir) configured. Beware of dataloss due to lack of redundant storage directories!
> 12/12/05 16:43:48 INFO util.HostsFileReader: Refreshing hosts (include/exclude) list
> 12/12/05 16:43:48 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
> 12/12/05 16:43:48 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=true
> 12/12/05 16:43:48 INFO blockmanagement.BlockManager: dfs.block.access.key.update.interval=600 min(s), dfs.block.access.token.lifetime=600 min(s), dfs.encrypt.data.transfer.algorithm=null
> 12/12/05 16:43:48 INFO namenode.MetaRecoveryContext: RECOVERY FAILED: caught exception
> java.lang.IllegalStateException: Could not determine own NN ID in namespace 'ha-nn-uri'. Please ensure that this node is one of the machines listed as an NN RPC address, or configure dfs.ha.namenode.id
>     at com.google.common.base.Preconditions.checkState(Preconditions.java:172)
>     at org.apache.hadoop.hdfs.HAUtil.getNameNodeIdOfOtherNode(HAUtil.java:155)
>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createBlockTokenSecretManager(BlockManager.java:323)
>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.<init>(BlockManager.java:239)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:451)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:416)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:386)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.doRecovery(NameNode.java:1063)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1135)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
> 12/12/05 16:43:48 FATAL namenode.NameNode: Exception in namenode join
> java.lang.IllegalStateException: Could not determine own NN ID in namespace 'ha-nn-uri'. Please ensure that this node is one of the machines listed as an NN RPC address, or configure dfs.ha.namenode.id
>     at com.google.common.base.Preconditions.checkState(Preconditions.java:172)
>     at org.apache.hadoop.hdfs.HAUtil.getNameNodeIdOfOtherNode(HAUtil.java:155)
>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createBlockTokenSecretManager(BlockManager.java:323)
>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.<init>(BlockManager.java:239)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:451)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:416)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:386)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.doRecovery(NameNode.java:1063)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1135)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
> 12/12/05 16:43:48 INFO util.ExitUtil: Exiting with status 1
> 12/12/05 16:43:48 INFO namenode.NameNode: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at cs-10-20-193-228.cloud.cloudera.com/10.20.193.228
> ************************************************************/
> {code}
> The exception message says
> {code}
> Please ensure that this node is one of the machines listed as an NN RPC address, or configure dfs.ha.namenode.id
> {code}
> I ran the recover command from a machine listed as an NN RPC:
> {code}
> <property>
>   <name>dfs.namenode.rpc-address.ha-nn-uri.nn1</name>
>   <value>cs-10-20-193-228.cloud.cloudera.com:17020</value>
> </property>
> {code}
> Setting dfs.ha.namenode.id allows me to proceed. If we always need to specify the dfs.ha.namenode.id, then we can edit the exception message.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
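
For reference, the workaround mentioned in the description (setting dfs.ha.namenode.id) might look roughly like the fragment below in hdfs-site.xml on the node where the recover command is run. This is only a sketch: the value nn1 is an assumption taken from the dfs.namenode.rpc-address.ha-nn-uri.nn1 key quoted in the report, and the attached hdfs-site.xml may use a different ID.

{code:xml}
<!-- Sketch only: "nn1" is assumed from the rpc-address key in the report -->
<property>
  <name>dfs.ha.namenode.id</name>
  <value>nn1</value>
</property>
{code}

The value has to match the NameNode ID suffix used in that node's own dfs.namenode.rpc-address.ha-nn-uri.* entry, so each NameNode host would carry a different value.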