From: Azuryy Yu <azuryyyu@gmail.com>
To: user@hadoop.apache.org
Date: Wed, 28 Aug 2013 00:00:18 +0800
Subject: RE: HDFS Startup Failure due to dfs.namenode.rpc-address and Shared Edits Directory

Not yet.

Please correct it.

On Aug 27, 2013 11:39 PM, "Smith, Joshua D." <Joshua.Smith@gd-ais.com> wrote:

nn.domain is a placeholder for the actual fully qualified hostname of my NameNode.

snn.domain is a placeholder for the actual fully qualified hostname of my StandbyNameNode.

Of course both the NameNode and the StandbyNameNode are running exactly the same software with the same configuration, since this is YARN. I'm not running a SecondaryNameNode.

The actual fully qualified hostnames are on another network and my customer is sensitive about privacy, so that's why I didn't post the actual values.

So, I think I have the equivalent of nn1,nn2, do I not?

From: Azuryy Yu [mailto:azuryyyu@gmail.com]
Sent: Tuesday, August 27, 2013 11:32 AM
To: user@hadoop.apache.org
Subject: RE: HDFS Startup Failure due to dfs.namenode.rpc-address and Shared Edits Directory


dfs.ha.namenodes.mycluster
nn.domain,snn.domain

It should be:
dfs.ha.namenodes.mycluster
nn1,nn2
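
To make it concrete, a minimal sketch of how the pieces have to line up, keeping your nn.domain/snn.domain example hosts: the values of dfs.ha.namenodes.<nameservice> are logical IDs, and each one must match the suffix of a per-NameNode address property.

<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>nn.domain:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>snn.domain:8020</value>
</property>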

On Aug 27, 2013 11:22 PM, "Smith, Joshua D." <Joshua.Smith@gd-ais.com> wrote:

Harsh-

Here are all of the other values that I have configured.

hdfs-site.xml
-----------------

dfs.webhdfs.enabled
true

dfs.client.failover.proxy.provider.mycluster
org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
dfs.ha.automatic-falover.enabled
true

ha.zookeeper.quorum
nn.domain:2181,snn.domain:2181,jt.domain:2181

dfs.journalnode.edits.dir
/opt/hdfs/data1/dfs/jn

dfs.namenode.shared.edits.dir
qjournal://nn.domain:8485;snn.domain:8485;jt.domain:8485/mycluster

dfs.nameservices
mycluster

dfs.ha.namenodes.mycluster
nn.domain,snn.domain

dfs.namenode.rpc-address.mycluster.nn1
nn.domain:8020

dfs.namenode.rpc-address.mycluster.nn2
snn.domain:8020

dfs.namenode.http-address.mycluster.nn1
nn.domain:50070

dfs.namenode.http-address.mycluster.nn2
snn.domain:50070

dfs.name.dir
/var/lib/hadoop-hdfs/cache/hdfs/dfs/name
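
(One aside, assuming the listing above was pasted verbatim: the automatic-failover property name looks misspelled, and a misspelled key is silently ignored. A sketch of the expected entry in hdfs-site.xml:)

<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>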


core-site.xml
----------------
fs.trash.interval
1440

fs.trash.checkpoint.interval
1440

fs.defaultFS
hdfs://mycluster

dfs.datanode.data.dir
/hdfs/data1,/hdfs/data2,/hdfs/data3,/hdfs/data4,/hdfs/data5,/hdfs/data6,/hdfs/data7


mapred-site.xml
----------------------
mapreduce.framework.name
yarn

mapreduce.jobhistory.address
jt.domain:10020

mapreduce.jobhistory.webapp.address
jt.domain:19888


yarn-site.xml
-------------------
yarn.nodemanager.aux-service
mapreduce.shuffle

yarn.nodemanager.aux-services.mapreduce.shuffle.class
org.apache.hadoop.mapred.ShuffleHandler

yarn.log-aggregation-enable
true

yarn.nodemanager.remote-app-log-dir
/var/log/hadoop-yarn/apps

yarn.application.classpath
$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,$YARN_HOME/*,$YARN_HOME/lib/*

yarn.resourcemanager.resource-tracker.address
jt.domain:8031

yarn.resourcemanager.address
jt.domain:8032

yarn.resourcemanager.scheduler.address
jt.domain:8030

yarn.resourcemanager.admin.address
jt.domain:8033

yarn.reesourcemanager.webapp.address
jt.domain:8088
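
(Same caveat for yarn-site.xml, assuming a verbatim paste: the aux-services and webapp-address names above look misspelled; the expected forms would be:)

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce.shuffle</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address</name>
  <value>jt.domain:8088</value>
</property>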


These are the only interesting entries in my HDFS log file when I try to start the NameNode with "service hadoop-hdfs-namenode start".

WARN org.apache.hadoop.hdfs.server.common.Util: Path /var/lib/hadoop-hdfs/cache/hdfs/dfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one image storage directory (dfs.namenode.name.dir) configured. Beware of data loss due to lack of redundant storage directories!
INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: HA Enabled: false
WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Configured NNs: ((there's a blank line here implying no configured NameNodes!))
ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
java.io.IOException: Invalid configuration: a shared edits dir must not be specified if HA is not enabled.
FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.io.IOException: Invalid configuration: a shared edits dir must not be specified if HA is not enabled.

I don't like the blank line for Configured NNs. Not sure why it's not finding them.

If I try the command "hdfs zkfc -formatZK" I get the following:
Exception in thread "main" org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled for this namenode.
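
(Both symptoms point at HA detection. A quick way to see what the NameNode actually resolves, as a sketch assuming the stock hdfs CLI that ships with CDH4, is to print the HA-related keys; if the second command returns hostnames rather than IDs that match the rpc-address suffixes, the NameNode will find no configured NNs:

hdfs getconf -confKey dfs.nameservices
hdfs getconf -confKey dfs.ha.namenodes.mycluster
hdfs getconf -confKey dfs.namenode.rpc-address.mycluster.nn1)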

-----Original Message-----
From: Smith, Joshua D. [mailto:Joshua.Smith@gd-ais.com]
Sent: Tuesday, August 27, 2013 7:17 AM
To: user@hadoop.apache.org
Subject: RE: HDFS Startup Failure due to dfs.namenode.rpc-address and Shared Edits Directory

Harsh-

Yes, I intend to use HA. That's what I'm trying to configure right now.

Unfortunately I cannot share my complete configuration files. They're on a disconnected network. Are there any configuration items that you'd like me to post my settings for?

The deployment is CDH 4.3 on a brand new cluster. There are 3 master nodes (NameNode, StandbyNameNode, JobTracker/ResourceManager) and 7 slave nodes. Each of the master nodes is configured to be a ZooKeeper node as well as a JournalNode. The HA configuration that I'm striving toward is automatic failover with ZooKeeper.
Does that help?
Josh

-----Original Message-----
From: Harsh J [mailto:harsh@cloudera.com]
Sent: Monday, August 26, 2013 6:11 PM
To: <user@hadoop.apache.org>
Subject: Re: HDFS Startup Failure due to dfs.namenode.rpc-address and Shared Edits Directory

It is not quite clear from your post, so a Q: do you intend to use HA or not?

Can you share your complete core-site.xml and hdfs-site.xml along with a brief note on the deployment?

On Tue, Aug 27, 2013 at 12:48 AM, Smith, Joshua D. <Joshua.Smith@gd-ais.com> wrote:
> When I try to start HDFS I get an error in the log that says...
>
>
>
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
> initialization failed.
>
> java.io.IOException: Invalid configuration: a shared edits dir must
> not be specified if HA is not enabled.
>
>
>
> I have the following properties configured as per page 12 of the CDH4
> High Availability Guide...
>
> http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/PDF/CDH4-High-Availability-Guide.pdf
>
>
>
> <property>
>
> <name>dfs.namenode.rpc-address.mycluster.nn1</name>
>
> <value>nn.domain:8020</value>
>
> </property>
>
> <property>
>
> <name>dfs.namenode.rpc-address.mycluster.nn2</name>
>
> <value>snn.domain:8020</value>
>
> </property>
>
>
>
> When I look at the Hadoop source code that generates the error message
> I can see that it's failing because it's looking for
> dfs.namenode.rpc-address without the suffix. I'm assuming that the
> suffix gets lopped off at some point before it gets pulled up and the
> property is checked for, so maybe I have the suffix wrong?
>
>
>
> In any case I can't get HDFS to start because it's looking for a
> property that I don't have in the truncated form, and it doesn't seem to
> be finding the form of it with the suffix. Any assistance would be most appreciated.
>
>
>
> Thanks,
>
> Josh
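
(On the "suffix gets lopped off" guess above: from my reading of the 2.0-era DFSUtil code, nothing strips the suffix. The NameNode first reads dfs.nameservices and dfs.ha.namenodes.<nameservice>, then composes dfs.namenode.rpc-address.<nameservice>.<namenode-id> for each listed ID; the bare dfs.namenode.rpc-address key is only consulted on the non-HA path. With dfs.ha.namenodes.mycluster set to nn.domain,snn.domain, the composed key is dfs.namenode.rpc-address.mycluster.nn.domain, which is not defined, so the NameNode finds no configured NNs, treats HA as disabled, and then rejects the shared edits dir.)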



--
Harsh J
