Subject: Re: I am about to lose all my data please help
From: Azuryy Yu
To: user@hadoop.apache.org
Date: Tue, 18 Mar 2014 14:06:15 +0800

I don't think this is the case, because there is:

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/project/hadoop-data</value>
  </property>

On Tue, Mar 18, 2014 at 1:55 PM, Stanley Shi <sshi@gopivotal.com> wrote:

> One possible reason is that you didn't set the namenode working directory;
> by default it is in the "/tmp" folder, and the "/tmp" folder might get
> deleted by the OS without any notification. If this is the case, I am
> afraid you have lost all your namenode data.
>
> <property>
>   <name>dfs.name.dir</name>
>   <value>${hadoop.tmp.dir}/dfs/name</value>
>   <description>Determines where on the local filesystem the DFS name node
>       should store the name table (fsimage). If this is a comma-delimited list
>       of directories then the name table is replicated in all of the
>       directories, for redundancy.</description>
> </property>
>
> Regards,
> Stanley Shi
>
> On Sun, Mar 16, 2014 at 5:29 PM, Mirko Kämpf <mirko.kaempf@gmail.com> wrote:
>
>> Hi,
>>
>> What is the location of the namenode's fsimage and edit logs?
>> And how much memory does the NameNode have?
>>
>> Did you work with a Secondary NameNode or a Standby NameNode for
>> checkpointing?
>>
>> Where are your HDFS blocks located, and are those still safe?
>>
>> With this information at hand, one might be able to fix your setup, but
>> do not format the old namenode before all is working with a fresh one.
>>
>> Grab a copy of the maintenance guide:
>> http://shop.oreilly.com/product/0636920025085.do?sortby=publicationDate
>> which helps with solving this type of problem as well.
>>
>> Best wishes
>> Mirko
>>
>>
>> 2014-03-16 9:07 GMT+00:00 Fatih Haltas <fatih.haltas@nyu.edu>:
>>
>>> Dear All,
>>>
>>> I have just restarted the machines of my hadoop clusters. Now, I am trying
>>> to restart the hadoop clusters again, but I am getting an error on namenode
>>> restart. I am afraid of losing my data, as the cluster was running properly
>>> for more than 3 months. Currently, I believe that if I format the namenode,
>>> it will work again; however, the data will be lost. Is there any way to
>>> solve this without losing the data?
>>>
>>> I will really appreciate any help.
>>>
>>> Thanks.
>>>
>>> =====================
>>> Here are the logs:
>>> =====================
>>> 2014-02-26 16:02:39,698 INFO
>>> org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
>>> /************************************************************
>>> STARTUP_MSG: Starting NameNode
>>> STARTUP_MSG:   host = ADUAE042-LAP-V/127.0.0.1
>>> STARTUP_MSG:   args = []
>>> STARTUP_MSG:   version = 1.0.4
>>> STARTUP_MSG:   build =
>>> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r
>>> 1393290; compiled by 'hortonfo' on Wed Oct  3 05:13:58 UTC 2012
>>> ************************************************************/
>>> 2014-02-26 16:02:40,005 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
>>> hadoop-metrics2.properties
>>> 2014-02-26 16:02:40,019 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>> MetricsSystem,sub=Stats registered.
>>> 2014-02-26 16:02:40,021 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot
>>> period at 10 second(s).
>>> 2014-02-26 16:02:40,021 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system
>>> started
>>> 2014-02-26 16:02:40,169 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi
>>> registered.
>>> 2014-02-26 16:02:40,193 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm
>>> registered.
>>> 2014-02-26 16:02:40,194 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>> NameNode registered.
>>> 2014-02-26 16:02:40,242 INFO org.apache.hadoop.hdfs.util.GSet: VM type
>>> = 64-bit
>>> 2014-02-26 16:02:40,242 INFO org.apache.hadoop.hdfs.util.GSet: 2% max
>>> memory = 17.77875 MB
>>> 2014-02-26 16:02:40,242 INFO org.apache.hadoop.hdfs.util.GSet: capacity
>>> = 2^21 = 2097152 entries
>>> 2014-02-26 16:02:40,242 INFO org.apache.hadoop.hdfs.util.GSet:
>>> recommended=2097152, actual=2097152
>>> 2014-02-26 16:02:40,273 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hadoop
>>> 2014-02-26 16:02:40,273 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
>>> 2014-02-26 16:02:40,274 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>> isPermissionEnabled=true
>>> 2014-02-26 16:02:40,279 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>> dfs.block.invalidate.limit=100
>>> 2014-02-26 16:02:40,279 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>> isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s),
>>> accessTokenLifetime=0 min(s)
>>> 2014-02-26 16:02:40,724 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
>>> FSNamesystemStateMBean and NameNodeMXBean
>>> 2014-02-26 16:02:40,749 INFO
>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names
>>> occuring more than 10 times
>>> 2014-02-26 16:02:40,780 ERROR
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
>>> initialization failed.
>>> java.io.IOException: NameNode is not formatted.
>>>         at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:330)
>>>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:100)
>>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:388)
>>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:362)
>>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:276)
>>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:496)
>>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1279)
>>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1288)
>>> 2014-02-26 16:02:40,781 ERROR
>>> org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.IOException:
>>> NameNode is not formatted.
>>>         at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:330)
>>>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:100)
>>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:388)
>>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:362)
>>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:276)
>>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:496)
>>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1279)
>>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1288)
>>>
>>> 2014-02-26 16:02:40,781 INFO
>>> org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
>>> /************************************************************
>>> SHUTDOWN_MSG: Shutting down NameNode at ADUAE042-LAP-V/127.0.0.1
>>> ************************************************************/
>>>
>>> ===========================
>>> Here is the core-site.xml
>>> ===========================
>>>
>>> <?xml version="1.0"?>
>>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>>
>>> <!-- Put site-specific property overrides in this file. -->
>>>
>>> <configuration>
>>>   <property>
>>>     <name>fs.default.name</name>
>>>     <value>-BLANKED</value>
>>>   </property>
>>>   <property>
>>>     <name>hadoop.tmp.dir</name>
>>>     <value>/home/hadoop/project/hadoop-data</value>
>>>   </property>
>>> </configuration>
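Given the core-site.xml above, dfs.name.dir would resolve to ${hadoop.tmp.dir}/dfs/name, i.e. /home/hadoop/project/hadoop-data/dfs/name. Before considering a reformat, it is worth checking whether that directory still holds namenode metadata on disk. The sketch below is a hedged illustration, not from the thread: the helper name and the assumed Hadoop 1.x storage layout (a current/ subdirectory containing VERSION and fsimage) are assumptions for illustration.

```shell
# Hedged sketch (not from the thread): report whether a directory looks
# like a formatted Hadoop 1.x namenode storage directory. The expected
# layout (current/VERSION and current/fsimage) is an assumption here.
check_name_dir() {
    dir=$1
    if [ -f "$dir/current/VERSION" ] && [ -f "$dir/current/fsimage" ]; then
        echo "formatted: $dir"
    else
        echo "NOT formatted (or wrong path): $dir"
        return 1
    fi
}

# With the configuration quoted above, the directory to check would be:
# check_name_dir /home/hadoop/project/hadoop-data/dfs/name
```

If the check reports the directory as empty or missing, the metadata is likely gone (e.g. /tmp cleanup, as Stanley suggests); if the files are present, the "NameNode is not formatted" error points at a configuration or path problem instead, and nothing should be formatted.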
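One lesson from this thread is to pin the namenode metadata directories explicitly rather than letting them default through hadoop.tmp.dir, so they can never land under /tmp. A minimal hdfs-site.xml sketch follows; the paths and the use of two directories are illustrative assumptions, not taken from the thread.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Pin namenode metadata to durable disks instead of relying on the
       ${hadoop.tmp.dir}/dfs/name default. Paths are illustrative; a
       comma-delimited list replicates the name table for redundancy,
       as the dfs.name.dir description quoted above explains. -->
  <property>
    <name>dfs.name.dir</name>
    <value>/data/1/dfs/name,/data/2/dfs/name</value>
  </property>
</configuration>
```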