From: Leo Leung
To: "user@hadoop.apache.org"
Subject: RE: namenode not starting
Date: Fri, 24 Aug 2012 16:38:01 +0000
Message-ID:
<1C40D33AEC9DCA40A06F4637B6E58ABF1D2B3BE5@LAX-EX-MB2.datadirect.datadirectnet.com>
References: <858038922-1345793514-cardhu_decombobulator_blackberry.rim.net-342541276-@b25.c15.bise7.blackberry>

Abhay,

Sounds like your namenode cannot find the metadata information it needs to start (the /current | image | *checkpoints etc.).

Basically, if you cannot locate that data locally or on your NFS server, your cluster is busted.

But let's be optimistic about this.

There is a chance that your NFS server is down or that the mounted path is lost.

If it is NFS mounted (as you suggested), check that your host still has that path mounted, and from the proper NFS server. Running "mount" in a shell will tell you.

* Obviously, if you originally mounted from foo:/mydata and now mount bar:/mydata, you'll need to do some digging to find which NFS server it was writing to before.

If you cannot locate your namenode metadata (locally or on any of your NFS servers), either because the NFS server decided to become a black hole or because someone removed it, and you don't have a backup of your namenode (tape or Secondary Namenode), I think you are in a world of hurt.

In theory you can read the blocks on the DN and try to recover some of your data (assuming it is not codec-encoded / compressed).

Hmm... does anyone know of recovery services? (^^)

-----Original Message-----
From: Håvard Wahl Kongsgård [mailto:haavard.kongsgaard@gmail.com]
Sent: Friday, August 24, 2012 5:38 AM
To: user@hadoop.apache.org
Subject: Re: namenode not starting

You should start with a reboot of the system.
A lesson to everyone: this is exactly why you should have a secondary name node (http://wiki.apache.org/hadoop/FAQ#What_is_the_purpose_of_the_secondary_name-node.3F) and run the namenode on a mirrored RAID-5/10 disk.

-Håvard

On Fri, Aug 24, 2012 at 9:40 AM, Abhay Ratnaparkhi wrote:
> Hello,
>
> I have been using the cluster for a long time and have not formatted the namenode.
> I ran only the bin/stop-all.sh and bin/start-all.sh scripts.
>
> I am using NFS for dfs.name.dir.
> hadoop.tmp.dir is a /tmp directory. I've not restarted the OS. Any
> way to recover the data?
>
> Thanks,
> Abhay
>
>
> On Fri, Aug 24, 2012 at 1:01 PM, Bejoy KS wrote:
>>
>> Hi Abhay
>>
>> What is the value for hadoop.tmp.dir or dfs.name.dir? If it was set
>> to /tmp the contents would be deleted on an OS restart. You need to
>> change this location before you start your NN.
>> Regards
>> Bejoy KS
>>
>> Sent from handheld, please excuse typos.
>> ________________________________
>> From: Abhay Ratnaparkhi
>> Date: Fri, 24 Aug 2012 12:58:41 +0530
>> To:
>> ReplyTo: user@hadoop.apache.org
>> Subject: namenode not starting
>>
>> Hello,
>>
>> I had a running hadoop cluster.
>> I restarted it and after that the namenode is unable to start. I am
>> getting an error saying that it's not formatted. :( Is it possible to
>> recover the data on HDFS?
>>
>> 2012-08-24 03:17:55,378 ERROR
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
>> initialization failed.
>> java.io.IOException: NameNode is not formatted.
>>         at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:434)
>>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:110)
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:291)
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:270)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:271)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:303)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:433)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:421)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1359)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1368)
>> 2012-08-24 03:17:55,380 ERROR
>> org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.IOException:
>> NameNode is not formatted.
>>         at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:434)
>>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:110)
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:291)
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:270)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:271)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:303)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:433)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:421)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1359)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1368)
>>
>> Regards,
>> Abhay
>>
>>
>

--
Håvard Wahl Kongsgård
Faculty of Medicine & Department of Mathematical Sciences
NTNU
http://havard.security-review.net/
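
For anyone following the advice in this thread, the two checks suggested in the reply (is the dfs.name.dir path still mounted, and is the metadata still there) could be scripted roughly as below. This is only a sketch under assumptions: the helper names find_mount_point and check_name_dir are made up for illustration, and the expected file names (VERSION, fsimage, edits under current/) reflect the Hadoop 1.x name directory layout, so other releases may differ.

```python
import os

# Files the NameNode is assumed to keep under <dfs.name.dir>/current
# (Hadoop 1.x layout; this list is an assumption, adjust for your release).
EXPECTED_FILES = ("VERSION", "fsimage", "edits")


def find_mount_point(path):
    """Walk up from path until a mount point is reached and return it."""
    path = os.path.realpath(path)
    while not os.path.ismount(path):
        path = os.path.dirname(path)
    return path


def check_name_dir(name_dir):
    """Return a list of problems found in name_dir (empty if none)."""
    if not os.path.isdir(name_dir):
        return ["%s does not exist or is not a directory" % name_dir]
    current = os.path.join(name_dir, "current")
    if not os.path.isdir(current):
        return ["missing directory: %s" % current]
    problems = []
    for name in EXPECTED_FILES:
        candidate = os.path.join(current, name)
        if not os.path.isfile(candidate):
            problems.append("missing file: %s" % candidate)
    return problems
```

Call find_mount_point with your dfs.name.dir value and compare the result against the output of "mount": if it comes back as "/" or a local disk rather than the NFS mount you expect, the export is probably not mounted. An empty list from check_name_dir only means the files are present, not that they are intact.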