hadoop-hdfs-user mailing list archives

From "Smith, Joshua D." <Joshua.Sm...@gd-ais.com>
Subject RE: HDFS Startup Failure due to dfs.namenode.rpc-address and Shared Edits Directory
Date Tue, 27 Aug 2013 15:38:23 GMT
nn.domain is a place holder for the actual fully qualified hostname of my NameNode
snn.domain is a place holder for the actual fully qualified hostname of my StandbyNameNode.

Of course both the NameNode and the StandbyNameNode are running exactly the same software
with the same configuration, since this is YARN. I'm not running a SecondaryNameNode.

The actual fully qualified hostnames are on another network and my customer is sensitive about
privacy, so that's why I didn't post the actual values.

So I think I have the equivalent of nn1,nn2, do I not?

From: Azuryy Yu [mailto:azuryyyu@gmail.com]
Sent: Tuesday, August 27, 2013 11:32 AM
To: user@hadoop.apache.org
Subject: RE: HDFS Startup Failure due to dfs.namenode.rpc-address and Shared Edits Directory


dfs.ha.namenodes.mycluster
nn.domain,snn.domain

it should be:
dfs.ha.namenodes.mycluster
nn1,nn2
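
In other words, the logical IDs listed in dfs.ha.namenodes.<nameservice> must match the suffixes used on the per-NameNode keys; the hostnames go only in the values. A minimal hdfs-site.xml sketch (hostnames are placeholders, as in Josh's mail):

```xml
<!-- Logical IDs for the two NameNodes in the "mycluster" nameservice -->
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<!-- Each rpc-address key is suffixed with one of those logical IDs;
     the actual hostname appears only in the value -->
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>nn.domain:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>snn.domain:8020</value>
</property>
```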
On Aug 27, 2013 11:22 PM, "Smith, Joshua D." <Joshua.Smith@gd-ais.com>
wrote:
Harsh-

Here are all of the other values that I have configured.

hdfs-site.xml
-----------------

dfs.webhdfs.enabled
true

dfs.client.failover.proxy.provider.mycluster
org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider

dfs.ha.automatic-falover.enabled
true

ha.zookeeper.quorum
nn.domain:2181,snn.domain:2181,jt.domain:2181

dfs.journalnode.edits.dir
/opt/hdfs/data1/dfs/jn

dfs.namenode.shared.edits.dir
qjournal://nn.domain:8485;snn.domain:8485;jt.domain:8485/mycluster

dfs.nameservices
mycluster

dfs.ha.namenodes.mycluster
nn.domain,snn.domain

dfs.namenode.rpc-address.mycluster.nn1
nn.domain:8020

dfs.namenode.rpc-address.mycluster.nn2
snn.domain:8020

dfs.namenode.http-address.mycluster.nn1
nn.domain:50070

dfs.namenode.http-address.mycluster.nn2
snn.domain:50070

dfs.name.dir
/var/lib/hadoop-hdfs/cache/hdfs/dfs/name


core-site.xml
----------------
fs.trash.interval
1440

fs.trash.checkpoint.interval
1440

fs.defaultFS
hdfs://mycluster

dfs.datanode.data.dir
/hdfs/data1,/hdfs/data2,/hdfs/data3,/hdfs/data4,/hdfs/data5,/hdfs/data6,/hdfs/data7


mapred-site.xml
----------------------
mapreduce.framework.name
yarn

mapreduce.jobhistory.address
jt.domain:10020

mapreduce.jobhistory.webapp.address
jt.domain:19888


yarn-site.xml
-------------------
yarn.nodemanager.aux-service
mapreduce.shuffle

yarn.nodemanager.aux-services.mapreduce.shuffle.class
org.apache.hadoop.mapred.ShuffleHandler

yarn.log-aggregation-enable
true

yarn.nodemanager.remote-app-log-dir
/var/log/hadoop-yarn/apps

yarn.application.classpath
$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib
/*,$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,$YARN_HOME/*,$YARN_HOME/lib/*

yarn.resourcemanager.resource-tracker.address
jt.domain:8031

yarn.resourcemanager.address
jt.domain:8032

yarn.resourcemanager.scheduler.address
jt.domain:8030

yarn.resourcemanager.admin.address
jt.domain:8033

yarn.reesourcemanager.webapp.address
jt.domain:8088


These are the only interesting entries in my HDFS log file when I try to start the NameNode
with "service hadoop-hdfs-namenode start".

WARN org.apache.hadoop.hdfs.server.common.Util: Path /var/lib/hadoop-hdfs/cache/hdfs/dfs/name
should be specified as a URI in configuration files. Please update hdfs configuration.
WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one image storage directory
(dfs.namenode.name.dir) configured. Beware of data loss due to lack of redundant storage directories!
INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: HA Enabled: false
WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Configured NNs:
((there's a blank line here implying no configured NameNodes!))
ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
java.io.IOException: Invalid configuration: a shared edits dir must not be specified if HA
is not enabled.
FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.io.IOException: Invalid configuration: a shared edits dir must not be specified if HA
is not enabled.

I don't like the blank line for Configured NNs. Not sure why it's not finding them.

If I try the command "hdfs zkfc -formatZK" I get the following:
Exception in thread "main" org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled
for this namenode.

-----Original Message-----
From: Smith, Joshua D. [mailto:Joshua.Smith@gd-ais.com]
Sent: Tuesday, August 27, 2013 7:17 AM
To: user@hadoop.apache.org
Subject: RE: HDFS Startup Failure due to dfs.namenode.rpc-address and Shared Edits Directory

Harsh-

Yes, I intend to use HA. That's what I'm trying to configure right now.

Unfortunately I cannot share my complete configuration files. They're on a disconnected network.
Are there any configuration items that you'd like me to post my settings for?

The deployment is CDH 4.3 on a brand new cluster. There are 3 master nodes (NameNode, StandbyNameNode,
JobTracker/ResourceManager) and 7 slave nodes. Each of the master nodes is configured to be
a Zookeeper node as well as a Journal node. The HA configuration that I'm striving toward
is the automatic fail-over with Zookeeper.
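
For automatic failover with ZooKeeper, the two relevant keys (note the exact spelling of dfs.ha.automatic-failover.enabled) would look roughly like this; the quorum hosts below are the same placeholders used elsewhere in this thread:

```xml
<!-- Goes in hdfs-site.xml -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
<!-- Goes in core-site.xml: the ZooKeeper ensemble the ZKFCs use -->
<property>
  <name>ha.zookeeper.quorum</name>
  <value>nn.domain:2181,snn.domain:2181,jt.domain:2181</value>
</property>
```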

Does that help?
Josh

-----Original Message-----
From: Harsh J [mailto:harsh@cloudera.com]
Sent: Monday, August 26, 2013 6:11 PM
To: <user@hadoop.apache.org>
Subject: Re: HDFS Startup Failure due to dfs.namenode.rpc-address and Shared Edits Directory

It is not quite clear from your post, so a Q: Do you intend to use HA or not?

Can you share your complete core-site.xml and hdfs-site.xml along with a brief note on the
deployment?

On Tue, Aug 27, 2013 at 12:48 AM, Smith, Joshua D.
<Joshua.Smith@gd-ais.com> wrote:
> When I try to start HDFS I get an error in the log that says...
>
>
>
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
> initialization failed.
>
> java.io.IOException: Invalid configuration: a shared edits dir must
> not be specified if HA is not enabled.
>
>
>
> I have the following properties configured as per page 12 of the CDH4
> High Availability Guide...
>
> http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/PDF/CDH4-High-Availability-Guide.pdf
>
>
>
> <property>
>
> <name>dfs.namenode.rpc-address.mycluster.nn1</name>
>
> <value>nn.domain:8020</value>
>
> </property>
>
> <property>
>
> <name>dfs.namenode.rpc-address.mycluster.nn2</name>
>
> <value>snn.domain:8020</value>
>
> </property>
>
>
>
> When I look at the Hadoop source code that generates the error message
> I can see that it's failing because it's looking for
> dfs.namenode.rpc-address without the suffix. I'm assuming that the
> suffix gets lopped off at some point before it gets pulled up and the
> property is checked for, so maybe I have the suffix wrong?
>
>
>
> In any case I can't get HDFS to start, because it's looking for a
> property that I don't have in the truncated form, and it doesn't seem to
> be finding the form of it with the suffix. Any assistance would be most appreciated.
>
>
>
> Thanks,
>
> Josh



--
Harsh J
