hadoop-common-user mailing list archives

From Allen Wittenauer <awittena...@linkedin.com>
Subject Re: Recovering Corrupt FS Image on Amazon EBS
Date Mon, 05 Oct 2009 18:51:34 GMT

On 10/5/09 11:41 AM, "Malcolm Matalka" <mmatalka@millennialmedia.com> wrote:
> In the event of an error, we bring all the instances down.  I then tried
> to rerun the job (bringing all the instances back up and then attaching
> to EBS volumes) and the namenode will not come up.  The logfile gives
> the error at the bottom.  What are my options here to recover the file
> system?

Your edits file is corrupt.   You have some choices:

A) if you ran a secondary and ran it frequently, hacking the edits off at
the point of corruption will set the HDFS pretty close to the point of last

B) If you didn't run the secondary that often or you don't make that many
changes, you may just want to ignore the edits file and bring up the HDFS
without it.

C) Check your other directory--you -are- writing fsimage and edits to two
different dirs, right?  The other edits file may be healthier.

But I suspect you're looking at data loss. :(
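
For options A and C, a rough sketch of the file handling only, in Python. Nothing here is part of Hadoop: the function names, the paths in the comments, and the byte offset are all hypothetical, and finding the actual point of corruption in the edits file is up to you. It compares two copies of a file by size and checksum, and cuts a copy (never the original) off at a given offset.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path):
    """Return (size_in_bytes, sha256_hex) so two copies can be compared."""
    data = Path(path).read_bytes()
    return len(data), hashlib.sha256(data).hexdigest()


def truncate_copy(src, dst, offset):
    """Copy src to dst, then cut dst off at `offset` bytes.

    The source file is left untouched, so the original edits file
    remains available if the truncated copy turns out to be no better.
    """
    size = Path(src).stat().st_size
    if offset < 0 or offset > size:
        raise ValueError(f"offset {offset} is outside the file (size {size})")
    shutil.copyfile(src, dst)
    with open(dst, "r+b") as f:
        f.truncate(offset)
    return Path(dst).stat().st_size


# Hypothetical usage; the directories and the offset are assumptions:
#   print(file_digest("/mnt/name1/current/edits"))
#   print(file_digest("/mnt/name2/current/edits"))
#   truncate_copy("/mnt/name1/current/edits", "/tmp/edits.truncated", 123456)
```

If the two directories report different sizes or checksums, the copies have diverged and the second one is worth trying first. Either way, work on copies of the whole name directory until the namenode comes up cleanly.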

> 2009-10-05 14:20:07,451 ERROR
> org.apache.hadoop.hdfs.server.namenode.NameNode:
> java.lang.NumberFormatException: For input string: ""
>         at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
>         at java.lang.Integer.parseInt(Integer.java:468)
>         at java.lang.Short.parseShort(Short.java:120)
>         at java.lang.Short.parseShort(Short.java:78)
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLog.readShort(FSEditLog.java:1261)
