hadoop-common-user mailing list archives

From praveenesh kumar <praveen...@gmail.com>
Subject Re: Best practices to recover from Corrupt Namenode
Date Thu, 19 Jan 2012 07:19:42 GMT
Hi everyone,
Any ideas on how to tackle this kind of situation?

Thanks,
Praveenesh

On Tue, Jan 17, 2012 at 1:02 PM, praveenesh kumar <praveenesh@gmail.com> wrote:

> I have a replication factor of 2, because I cannot afford 3 replicas on
> my cluster.
> The fsck output said block replicas were missing for some files, which
> is what was marking the namenode as corrupt.
> I don't have the output with me, but the issue was that block replicas
> were missing. How can we tackle that?
>
> Is there an internal mechanism for creating new blocks if they are found
> missing, some kind of refresh command or something?
>
>
> Thanks,
> Praveenesh
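For context on the re-replication question above: HDFS does re-create missing replicas automatically, but only while at least one replica of the block survives somewhere. With a replication factor of 2, losing the wrong disk can wipe out a block entirely, and fsck is then the tool for finding which files were hit. A minimal sketch of pulling the affected file paths out of a saved fsck report (the report text and paths below are hypothetical, fabricated here purely so the parsing can be shown end to end):

```shell
# Save the report once, then work from the copy. The real report would come from:
#   hadoop fsck / -files -blocks > fsck.out
# The here-doc below fakes a report with hypothetical paths for illustration.
cat > fsck.out <<'EOF'
/user/praveenesh/output/part-00000: CORRUPT block blk_1234567
/user/praveenesh/output/part-00000: MISSING 1 blocks of total size 67108864 B
/user/praveenesh/output/part-00001: MISSING 2 blocks of total size 134217728 B
Status: CORRUPT
EOF

# Extract the unique paths of files that have corrupt or missing blocks.
# The second grep keeps only lines that start with an HDFS path, so summary
# lines like "Status: CORRUPT" are dropped.
grep -E 'CORRUPT|MISSING' fsck.out | grep -o '^/[^:]*' | sort -u
```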
>
> On Tue, Jan 17, 2012 at 12:48 PM, Harsh J <harsh@cloudera.com> wrote:
>
>> You ran into a corrupt files issue, not a namenode corruption (which
>> generally refers to the fsimage or edits getting corrupted).
>>
>> Did your files not have enough replication to withstand the loss of one
>> DN's disk? What exactly did fsck output? Did all block replicas go
>> missing for your files?
>>
>> On 17-Jan-2012, at 12:08 PM, praveenesh kumar wrote:
>>
>> > Hi guys,
>> >
>> > I just faced a weird situation, in which one of my hard disks on DN went
>> > down.
>> > Due to which when I restarted namenode, some of the blocks went missing
>> and
>> > it was saying my namenode is CORRUPT and in safe mode, which doesn't
>> allow
>> > you to add or delete any files on HDFS.
>> >
>> > I know , we can close the safe mode part.
>> > Problem is how to deal with Corrupt Namenode problem in this case --
>> Best
>> > practices.
>> >
>> > In my case, I was lucky that all missing blocks were that of the
>> Outputs of
>> > my M/R codes I ran previously.
>> > So I just deleted all those files with the missing blocks from HDFS to
>> come
>> > from CORRUPT --> HEALTHY state.
>> >
>> > But had it be for the large input data files , it won't be a good
>> solution
>> > in that case to delete those files.
>> >
>> > So I wanted to know what should be the best practices to deal with above
>> > kind of problems to go from CORRUPT NAMENODE --> HEALTHY NAMENODE?
>> >
>> > Thanks,
>> > Praveenesh
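For reference, the usual remedies for the situation described above are built into the fsck and dfsadmin tools themselves. A sketch of the sequence one might run (assuming a Hadoop 1.x-era client on the PATH; these commands modify HDFS, so treat this as an outline to adapt against a real cluster, not a recipe to paste):

```shell
# 1. Confirm which files actually lost all replicas of some block:
hadoop fsck / -files -blocks | grep -i -E 'CORRUPT|MISSING'

# 2a. Quarantine the damaged files into /lost+found for later inspection...
hadoop fsck / -move

# 2b. ...or, if the data is reproducible (e.g. M/R job output), delete them:
hadoop fsck / -delete

# 3. Once fsck reports HEALTHY, leave safe mode if the namenode does not
#    exit it on its own:
hadoop dfsadmin -safemode leave
```

These operations require a running cluster, so no runnable test accompanies this fragment; the commands themselves (`fsck -move`, `fsck -delete`, `dfsadmin -safemode leave`) are standard Hadoop CLI options of that era.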
>>
>> --
>> Harsh J
>> Customer Ops. Engineer, Cloudera
>>
>>
>
