hadoop-general mailing list archives

From Uma Maheswara Rao G <mahesw...@huawei.com>
Subject RE: Cannot restart Namenode after disk full
Date Tue, 31 Jul 2012 12:55:18 GMT
Hi Mourad,

 I think you are hitting HDFS-1594.
In our experience, there is a chance of edit-log corruption like this when the disk becomes full.
With that fix, the NameNode moves into safemode automatically when the disk fills up, but it was committed only to hadoop-2/trunk and not to the hadoop-1 releases.
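For example, on a hadoop-2 deployment you could check the threshold and leave safemode once space is freed. This is only a rough sketch; the property name comes from hadoop-2's hdfs-default.xml (default around 100 MB), so please verify it against your build:

    # free-space threshold (bytes) below which the NameNode enters safemode
    hdfs getconf -confKey dfs.namenode.resource.du.reserved
    # after freeing disk space, check the state and leave safemode manually
    hdfs dfsadmin -safemode get
    hdfs dfsadmin -safemode leave

On hadoop-1 there is no such automatic check.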

Regards,
Uma

________________________________________
From: mouradk [mouradk78@googlemail.com]
Sent: Tuesday, July 31, 2012 4:23 PM
To: general@hadoop.apache.org
Subject: Re: Cannot restart Namenode after disk full

Thanks for the advice, Ryan. We will certainly roll out the newer releases in the near future.
This is my first post on the channel; thanks all for your support.

Mouradk
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)


On Tuesday, 31 July 2012 at 03:13, Ryan Rawson wrote:

> And for god's sake, please please please please PLEASE upgrade to a
> newer version of Hadoop and HBase!
>
> That variant of Hadoop is broken for HBase -- you lost data for
> certain -- and that HBase version is excessively old. The HBase team
> is on its 3rd major release since the version you are running: 0.90,
> 0.92 and now 0.94. All have significant performance, stability and
> other improvements over 0.20.
>
> No one should ever run HBase on plain Hadoop 0.20.x; you need at
> least the 0.20-append branch.
>
> On Mon, Jul 30, 2012 at 11:22 AM, Adam Brown <adam@hortonworks.com> wrote:
> > Can you send me your edit log file?
> >
> > adam@hortonworks.com
> >
> >
> >
> > On Mon, Jul 30, 2012 at 9:36 AM, mouradk <mouradk78@googlemail.com> wrote:
> > > Hi Adam,
> > >
> > > Thanks for your prompt reply. I am not sure how to attempt a restore from the SecondaryNameNode.
> > > When I restart Hadoop, the NameNode shuts down as per the log, but the SecondaryNameNode is launched.
> > >
> > > $jps
> > > 23675 RunJar
> > > 23225 TaskTracker
> > > 23023 SecondaryNameNode
> > > 22886 DataNode
> > > 4985 GossipRouter
> > > 30870 WOBootstrap
> > > 24684 Jps
> > > 5887 WOBootstrap
> > > 23100 JobTracker
> > > 24460 WOBootstrap
> > > 5838 WOBootstrap
> > > 26648 WOBootstrap
> > >
> > >
> > > I have read a few threads about repairing the edits file, but I am afraid I am not too sure how to attempt it.
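For reference, one way to restore from the SecondaryNameNode checkpoint on Hadoop 0.20 is the NameNode's -importCheckpoint startup option. This is only a rough sketch: the paths below are placeholders, the option expects an empty dfs.name.dir, and it recovers only up to the last checkpoint, so edits made after it are lost.

    # stop HDFS and move the corrupt name directory aside (placeholder paths)
    bin/stop-dfs.sh
    mv /data/dfs/name /data/dfs/name.corrupt
    mkdir /data/dfs/name
    # import the image kept by the SecondaryNameNode (fs.checkpoint.dir)
    bin/hadoop namenode -importCheckpoint

Hand-editing the edits file is also possible but much riskier; back up both dfs.name.dir and fs.checkpoint.dir before trying anything.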
> > >
> > > Many thanks,
> > >
> > > Mouradk
> > > Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
> > >
> > >
> > > On Monday, 30 July 2012 at 17:27, Adam Brown wrote:
> > >
> > > > HI Mouradk,
> > > >
> > > > looks like your edit log is corrupt
> > > >
> > > > can you recover from a secondary namenode?
> > > >
> > > > -Adam
> > > >
> > > > On Mon, Jul 30, 2012 at 9:26 AM, mouradk <mouradk78@googlemail.com> wrote:
> > > > > Dear all,
> > > > >
> > > > > We are running a Hadoop 0.20.2 single node with HBase 0.20.4 and cannot restart the NameNode after the disk got full.
> > > > > I have since freed up space, but the NameNode still fails to start with the following error:
> > > > >
> > > > >
> > > > > STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
> > > > > ************************************************************/
> > > > > 2012-07-30 16:02:23,649 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=NameNode, port=50001
> > > > > 2012-07-30 16:02:23,656 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: localhost/127.0.0.1:50001
> > > > > 2012-07-30 16:02:23,659 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
> > > > > 2012-07-30 16:02:23,660 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
> > > > > 2012-07-30 16:02:23,714 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hadoop,hadoop
> > > > > 2012-07-30 16:02:23,714 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
> > > > > 2012-07-30 16:02:23,714 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=false
> > > > > 2012-07-30 16:02:23,721 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
> > > > > 2012-07-30 16:02:23,723 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
> > > > > 2012-07-30 16:02:23,756 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 533
> > > > > 2012-07-30 16:02:23,833 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 2
> > > > > 2012-07-30 16:02:23,835 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 55400 loaded in 0 seconds.
> > > > > 2012-07-30 16:02:23,844 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.lang.NumberFormatException: For input string: "1343506"
> > > > > at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
> > > > > at java.lang.Long.parseLong(Long.java:419)
> > > > > at java.lang.Long.parseLong(Long.java:468)
> > > > > at org.apache.hadoop.hdfs.server.namenode.FSEditLog.readLong(FSEditLog.java:1273)
> > > > > at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:775)
> > > > > at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:992)
> > > > > at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:812)
> > > > > at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:364)
> > > > > at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
> > > > > at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
> > > > > at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
> > > > > at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
> > > > > at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
> > > > > at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
> > > > > at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
> > > > >
> > > > > 2012-07-30 16:02:23,845 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
> > > > >
> > > > >
> > > > > Your help is much appreciated!!
> > > > >
> > > > >
> > > > > Mouradk
> > > > > Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
> > > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Adam Brown
> > > > Enablement Engineer
> > > > Hortonworks
> > > >
> > >
> > >
> >
> >
> >
> >
> > --
> > Adam Brown
> > Enablement Engineer
> > Hortonworks
> >
>
>
>