hadoop-hdfs-user mailing list archives

From Adam Phelps <...@opendns.com>
Subject Re: Failed namenode restart, recovering from corrupt edits file?
Date Thu, 13 Jan 2011 01:45:01 GMT
On 1/12/11 1:36 PM, Adam Phelps wrote:
>> Also, there apparently is a way of healing a corrupt edits file using
>> your favorite hex editor. There is a thread here:
>> http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-user/201010.mbox/%3CAANLkTinBHmn1X8DLir-c4iBhjA9nh46tnS588CQCNv1h@mail.gmail.com%3E
> Thanks for the link. Manually editing the edits file is our current
> thought, a little understanding of the format should save us some pain.

I made a brief attempt at manually editing the edits file, but ended up
taking a different approach: I made some temporary changes to
FSEditLog.java (which I reverted once they had been used). I added a
try/catch statement around the code that was generating the
NullPointerException so that the error is caught and ignored, which
appears to have allowed the namenode to come up successfully. It looks
like ~20 files were problematic, all apparently temporary output from an
MR job. At the moment everything seems to be running correctly; we'll
see if that continues.
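
For anyone reading this later, here is a rough sketch of the idea, not
the actual patch. It is a self-contained toy model: the class
EditReplaySketch, the CloseOp record, and the map-based namespace are all
hypothetical stand-ins and do not correspond to real FSEditLog.java code.
It only shows the pattern of wrapping the per-record apply step in a
try/catch so that a record which triggers a NullPointerException is
logged and skipped instead of aborting the whole replay.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model of edit-log replay; none of these names come from Hadoop.
class EditReplaySketch {

    // Simplified edit record: "close the file at this path".
    static final class CloseOp {
        final String path;
        CloseOp(String path) { this.path = path; }
    }

    // Simplified namespace: path -> block count. Unknown paths map to null.
    private final Map<String, Integer> namespace = new HashMap<>();

    // Paths whose records were skipped, kept so they can be reviewed afterwards.
    final List<String> skipped = new ArrayList<>();

    void addFile(String path, int blocks) {
        namespace.put(path, blocks);
    }

    // Applying a record for a path the namespace does not know dereferences null.
    private int apply(CloseOp op) {
        Integer blocks = namespace.get(op.path);
        return blocks.intValue(); // throws NullPointerException if blocks is null
    }

    // Replays all records; a record that throws NullPointerException is
    // recorded in 'skipped' and ignored. Returns the number applied.
    int replay(List<CloseOp> ops) {
        int applied = 0;
        for (CloseOp op : ops) {
            try {
                apply(op);
                applied++;
            } catch (NullPointerException e) {
                skipped.add(op.path);
            }
        }
        return applied;
    }

    public static void main(String[] args) {
        EditReplaySketch sketch = new EditReplaySketch();
        sketch.addFile("/data/good", 3);
        int applied = sketch.replay(Arrays.asList(
                new CloseOp("/data/good"),
                new CloseOp("/tmp/job/_temporary/part-00000")));
        System.out.println("applied=" + applied + " skipped=" + sketch.skipped);
    }
}
```

Note that skipping a record this way means whatever that edit described
is lost, so it is worth keeping a list of the affected paths (as the
sketch does) and checking that they really are disposable, as the
temporary MR output was in our case.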

Todd - let me know if there's any information that would be useful for
looking into this issue.

- Adam
