From: Manickam P <manickam.p@outlook.com>
To: user@hadoop.apache.org
Subject: RE: Error while configuring HDFS fedration
Date: Mon, 23 Sep 2013 20:12:44 +0530
In-Reply-To: <831EE2E99FE24048B16EDA267FA10B6D2197197E@MTLDAG01.mtl.com>

Hi,

I followed your steps. The bind error is resolved, but I am still getting the second exception. The complete stack trace is below.

2013-09-23 10:26:01,887 INFO org.mortbay.log: Stopped SelectChannelConnector@lab2-hadoop2-vm1.eng.com:50070
2013-09-23 10:26:01,988 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system...
2013-09-23 10:26:01,989 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
2013-09-23 10:26:01,990 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2013-09-23 10:26:01,991 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /home/lab/hadoop-2.1.0-beta/tmp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:292)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:200)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:777)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:558)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:418)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:466)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:659)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:644)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1221)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1287)
2013-09-23 10:26:02,001 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2013-09-23 10:26:02,018 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
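
When a NameNode log is long, it can help to pull out only the fatal entry and the directory it complains about. The following is a minimal sketch, assuming the log line layout seen in the trace above (date, time with milliseconds, level, logger, message). The function name `find_fatal_storage_errors` and both regular expressions are illustrative and are not part of Hadoop.

```python
import re

# Assumed layout of a log entry header: "<date> <time>,<ms> <LEVEL> <logger>: <message>".
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) (?P<logger>[\w.$]+): (?P<msg>.*)$"
)
# Captures the directory and the reason from the exception message.
STORAGE_ERROR = re.compile(
    r"InconsistentFSStateException: Directory (?P<dir>\S+) "
    r"is in an inconsistent state: (?P<reason>.+)$"
)


def find_fatal_storage_errors(lines):
    """Return (timestamp, directory, reason) for each FATAL entry that is
    followed by an InconsistentFSStateException line."""
    results = []
    fatal_ts = None
    for line in lines:
        line = line.rstrip("\n")
        header = LOG_LINE.match(line)
        if header:
            # A new entry starts; remember its timestamp only if it is FATAL.
            fatal_ts = header.group("ts") if header.group("level") == "FATAL" else None
            continue
        error = STORAGE_ERROR.search(line)
        if error and fatal_ts is not None:
            results.append((fatal_ts, error.group("dir"), error.group("reason")))
    return results


# Lines copied from the trace in this message.
SAMPLE_LOG = """\
2013-09-23 10:26:01,990 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2013-09-23 10:26:01,991 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /home/lab/hadoop-2.1.0-beta/tmp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:292)
2013-09-23 10:26:02,001 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
"""

for ts, directory, reason in find_fatal_storage_errors(SAMPLE_LOG.splitlines()):
    print(ts, directory, reason)
```

Stack frames and INFO entries are skipped, so only the failing directory is reported.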


Thanks,
Manickam P


From: eladi@mellanox.com
To: user@hadoop.apache.org
Subject: RE: Error while configuring HDFS fedration
Date: Mon, 23 Sep 2013 14:05:47 +0000

A port can be reported as in use either because a live process holds it or because a stale (ghost) process was left behind. The second error may be caused by inconsistent permissions on different nodes, and/or by a DFS that needs to be formatted.

I suggest the following:

1. sbin/stop-dfs.sh && sbin/stop-yarn.sh
2. sudo killall java (on all nodes)
3. sudo chmod -R 755 /home/lab/hadoop-2.1.0-beta/tmp/dfs (on all nodes)
4. sudo rm -rf /home/lab/hadoop-2.1.0-beta/tmp/dfs/* (on all nodes)
5. bin/hdfs namenode -format -force
6. sbin/start-dfs.sh && sbin/start-yarn.sh

Then see if you get that error again.
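
Commands that travel through rich-text mail can arrive with en dashes in place of ASCII hyphens (byte 0x96 in Windows-1252 decodes to U+2013), and tools such as chmod or rm will then not treat them as options. The following is a minimal sketch for checking a pasted command before running it. The names `normalize_dashes` and `has_non_ascii` are hypothetical helpers, and the set of look-alike characters is an assumption rather than an exhaustive list.

```python
# Dash look-alikes that mail clients and word processors may substitute for "-".
DASH_LOOKALIKES = {
    "\u2013": "-",  # en dash (byte 0x96 in Windows-1252)
    "\u2014": "-",  # em dash (byte 0x97 in Windows-1252)
    "\u2212": "-",  # minus sign
}
_TABLE = str.maketrans(DASH_LOOKALIKES)


def has_non_ascii(command: str) -> bool:
    """True if the command contains any character outside ASCII."""
    return not command.isascii()


def normalize_dashes(command: str) -> str:
    """Replace dash look-alikes with ASCII hyphens; leave everything else alone."""
    return command.translate(_TABLE)


# Step 5 as it would look after decoding the Windows-1252 bytes of a pasted command.
pasted = b"bin/hdfs namenode \x96format \x96force".decode("cp1252")
print(has_non_ascii(pasted))
print(normalize_dashes(pasted))
```

Typing the commands by hand avoids the problem entirely; the check is only useful when copying from mail or documents.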

From: Manickam P [mailto:manickam.p@outlook.com]
Sent: Monday, September 23, 2013 4:44 PM
To: user@hadoop.apache.org
Subject: Error while configuring HDFS fedration

Guys,

I'm trying to configure HDFS federation with the 2.1.0-beta release. I have three machines, on which I want to run two NameNodes and one DataNode.

I have already set up passwordless SSH and the host entries. When I start the cluster, I get the errors below.

On node one I get this error:

java.net.BindException: Port in use: lab-hadoop.eng.com:50070

On another node I get this error:

org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /home/lab/hadoop-2.1.0-beta/tmp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
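
Before restarting a NameNode, one way to see whether its HTTP port can be bound on the local machine is to try binding a socket to it. The following is a minimal sketch; `port_is_free` is a hypothetical helper, and it only approximates what the NameNode's web server does when it opens its listener. It must be run on the node in question, since binding to another machine's address always fails.

```python
import socket


def port_is_free(host: str, port: int) -> bool:
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Either the port is already taken or the address is not
            # assigned to a local interface.
            return False
        return True


if __name__ == "__main__":
    # 50070 is the NameNode HTTP port used in the configuration in this thread.
    print(port_is_free("0.0.0.0", 50070))
```

A False result does not say which process holds the port; it only confirms that a new listener could not be opened at that moment.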

My core-site.xml contains the following:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://10.101.89.68:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/lab/hadoop-2.1.0-beta/tmp</value>
  </property>
</configuration>
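
Since no `dfs.namenode.name.dir` is set in this thread, the NameNode falls back to the stock default of `file://${hadoop.tmp.dir}/dfs/name`, which matches the directory named in the exception. The following is a minimal sketch for checking that directory on each NameNode host as the user that runs Hadoop. The helper names are hypothetical, and the checks only approximate what the exception message describes; they are not Hadoop's own validation.

```python
import os
from pathlib import Path


def default_name_dir(hadoop_tmp_dir: str) -> Path:
    """Directory used for NameNode metadata when dfs.namenode.name.dir is unset."""
    return Path(hadoop_tmp_dir) / "dfs" / "name"


def storage_dir_problems(path: Path) -> list:
    """List the reasons the current user could not use `path` as a storage directory."""
    if not path.exists():
        return ["does not exist"]
    if not path.is_dir():
        return ["is not a directory"]
    checks = (
        (os.R_OK, "not readable"),
        (os.W_OK, "not writable"),
        (os.X_OK, "not searchable"),
    )
    return [label for flag, label in checks if not os.access(path, flag)]


if __name__ == "__main__":
    # hadoop.tmp.dir value taken from the core-site.xml in this message.
    name_dir = default_name_dir("/home/lab/hadoop-2.1.0-beta/tmp")
    problems = storage_dir_problems(name_dir)
    print(name_dir, "OK" if not problems else ", ".join(problems))
```

An empty list means the directory exists and is accessible to the current user; it does not confirm that the directory has been formatted.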

My hdfs-site.xml contains the following:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.federation.nameservices</name>
    <value>ns1,ns2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1</name>
    <value>10.101.89.68:9001</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1</name>
    <value>10.101.89.68:50070</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address.ns1</name>
    <value>10.101.89.68:50090</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns2</name>
    <value>10.101.89.69:9001</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns2</name>
    <value>10.101.89.69:50070</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address.ns2</name>
    <value>10.101.89.69:50090</value>
  </property>
</configuration>
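
With several per-nameservice keys, it is easy to leave one out or misspell a suffix. The following is a minimal sketch that reads an hdfs-site.xml and lists the RPC and HTTP address configured for each nameservice. The function names are hypothetical, the sample embeds only the federation-related properties from this message, and the nameservice list key is passed as a parameter because this thread uses `dfs.federation.nameservices`.

```python
import xml.etree.ElementTree as ET

# Federation-related properties copied from the hdfs-site.xml in this message.
HDFS_SITE = """
<configuration>
  <property><name>dfs.federation.nameservices</name><value>ns1,ns2</value></property>
  <property><name>dfs.namenode.rpc-address.ns1</name><value>10.101.89.68:9001</value></property>
  <property><name>dfs.namenode.http-address.ns1</name><value>10.101.89.68:50070</value></property>
  <property><name>dfs.namenode.rpc-address.ns2</name><value>10.101.89.69:9001</value></property>
  <property><name>dfs.namenode.http-address.ns2</name><value>10.101.89.69:50070</value></property>
</configuration>
"""


def load_properties(xml_text: str) -> dict:
    """Flatten <property><name>/<value> pairs into a dict."""
    props = {}
    for prop in ET.fromstring(xml_text).findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def namenode_addresses(props: dict, key: str = "dfs.federation.nameservices") -> dict:
    """Map each nameservice ID to its RPC and HTTP address (None if missing)."""
    services = [ns.strip() for ns in props.get(key, "").split(",") if ns.strip()]
    return {
        ns: {
            "rpc": props.get(f"dfs.namenode.rpc-address.{ns}"),
            "http": props.get(f"dfs.namenode.http-address.{ns}"),
        }
        for ns in services
    }


def missing_addresses(addresses: dict) -> list:
    """Return (nameservice, kind) pairs that have no address configured."""
    return [(ns, kind) for ns, kinds in addresses.items()
            for kind, value in kinds.items() if value is None]


addresses = namenode_addresses(load_properties(HDFS_SITE))
for ns, kinds in addresses.items():
    print(ns, kinds["rpc"], kinds["http"])
print("missing:", missing_addresses(addresses))
```

To check a real file, the text passed to `load_properties` can be read from the hdfs-site.xml in the Hadoop configuration directory.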

Please help me fix this error.

Thanks,
Manickam P