Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hdfs-issues@hadoop.apache.org
Date: Thu, 6 Dec 2012 02:00:58 +0000 (UTC)
From: "Harsh J (JIRA)" <jira@apache.org>
To: hdfs-issues@hadoop.apache.org
Message-ID: <1461108814.66413.1354759258825.JavaMail.jiratomcat@arcas>
In-Reply-To: <967526869.66171.1354755298699.JavaMail.jiratomcat@arcas>
Subject: [jira] [Commented] (HDFS-4281) NameNode recovery does not detect NN
 RPC address on HA cluster
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HDFS-4281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511023#comment-13511023 ] 

Harsh J commented on HDFS-4281:
-------------------------------

Hi,

Is this a duplicate of Colin's HDFS-4279?
                
> NameNode recovery does not detect NN RPC address on HA cluster
> --------------------------------------------------------------
>
>                 Key: HDFS-4281
>                 URL: https://issues.apache.org/jira/browse/HDFS-4281
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.0.0-alpha
>            Reporter: Stephen Chu
>         Attachments: core-site.xml, hdfs-site.xml, nn_recover
>
>
> On a shut-down HA cluster, I ran "hdfs namenode -recover" and encountered:
> {code}
> You have selected Metadata Recovery mode.  This mode is intended to recover lost metadata on a corrupt filesystem.  Metadata recovery mode often permanently deletes data from your HDFS filesystem.  Please back up your edit log and fsimage before trying this!
> Are you ready to proceed? (Y/N)
>  (Y or N) Y
> 12/12/05 16:43:48 INFO namenode.MetaRecoveryContext: starting recovery...
> 12/12/05 16:43:48 WARN common.Util: Path /dfs/nn should be specified as a URI in configuration files. Please update hdfs configuration.
> 12/12/05 16:43:48 WARN common.Util: Path /dfs/nn should be specified as a URI in configuration files. Please update hdfs configuration.
> 12/12/05 16:43:48 WARN namenode.FSNamesystem: Only one image storage directory (dfs.namenode.name.dir) configured. Beware of dataloss due to lack of redundant storage directories!
> 12/12/05 16:43:48 INFO util.HostsFileReader: Refreshing hosts (include/exclude) list
> 12/12/05 16:43:48 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
> 12/12/05 16:43:48 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=true
> 12/12/05 16:43:48 INFO blockmanagement.BlockManager: dfs.block.access.key.update.interval=600 min(s), dfs.block.access.token.lifetime=600 min(s), dfs.encrypt.data.transfer.algorithm=null
> 12/12/05 16:43:48 INFO namenode.MetaRecoveryContext: RECOVERY FAILED: caught exception
> java.lang.IllegalStateException: Could not determine own NN ID in namespace 'ha-nn-uri'. Please ensure that this node is one of the machines listed as an NN RPC address, or configure dfs.ha.namenode.id
>         at com.google.common.base.Preconditions.checkState(Preconditions.java:172)
>         at org.apache.hadoop.hdfs.HAUtil.getNameNodeIdOfOtherNode(HAUtil.java:155)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createBlockTokenSecretManager(BlockManager.java:323)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.<init>(BlockManager.java:239)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:451)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:416)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:386)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.doRecovery(NameNode.java:1063)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1135)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
> 12/12/05 16:43:48 FATAL namenode.NameNode: Exception in namenode join
> java.lang.IllegalStateException: Could not determine own NN ID in namespace 'ha-nn-uri'. Please ensure that this node is one of the machines listed as an NN RPC address, or configure dfs.ha.namenode.id
>         at com.google.common.base.Preconditions.checkState(Preconditions.java:172)
>         at org.apache.hadoop.hdfs.HAUtil.getNameNodeIdOfOtherNode(HAUtil.java:155)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createBlockTokenSecretManager(BlockManager.java:323)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.<init>(BlockManager.java:239)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:451)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:416)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:386)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.doRecovery(NameNode.java:1063)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1135)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
> 12/12/05 16:43:48 INFO util.ExitUtil: Exiting with status 1
> 12/12/05 16:43:48 INFO namenode.NameNode: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at cs-10-20-193-228.cloud.cloudera.com/10.20.193.228
> ************************************************************/
> {code}
> The exception message says:
> {code}
> Please ensure that this node is one of the machines listed as an NN RPC address, or configure dfs.ha.namenode.id
> {code}
> I ran the recover command from a machine that is listed as an NN RPC address:
> {code}
>   <property>
>     <name>dfs.namenode.rpc-address.ha-nn-uri.nn1</name>
>     <value>cs-10-20-193-228.cloud.cloudera.com:17020</value>
>   </property>
> {code}
> Setting dfs.ha.namenode.id allows me to proceed. If dfs.ha.namenode.id must always be specified for recovery, then the exception message should be edited to say so.
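> A minimal sketch of that workaround, as an assumption rather than a copy of the attached hdfs-site.xml: judging by the rpc-address key above, the NameNode ID for this host would be nn1, so the property set on this machine might look like:
> {code}
>   <property>
>     <!-- assumed value: nn1 is taken from the suffix of dfs.namenode.rpc-address.ha-nn-uri.nn1 -->
>     <name>dfs.ha.namenode.id</name>
>     <value>nn1</value>
>   </property>
> {code}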

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira