hadoop-hdfs-issues mailing list archives

From "Aaron T. Myers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3521) Allow namenode to tolerate edit log corruption
Date Thu, 14 Jun 2012 19:49:43 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295266#comment-13295266 ]

Aaron T. Myers commented on HDFS-3521:

bq. Recovery mode in branch-1 just let you stop reading or quit without saving. It does not
really "recover" anything. IMO, it does not seem useful. How about we revert it?

I find the option to stop reading exceptionally helpful, since we know that under certain
circumstances the NN will end up with a partial final transaction at the end of the edit log,
which will prevent the NN from running. I've already used it to recover a customer's corrupted
edits file that had a partial final transaction at the end of it because a disk on the NN
had filled up.
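
To illustrate the failure mode described above, here is a minimal sketch of a reader that can stop at a partial final record. The record layout (length prefix, payload, CRC32), the names `encode_record` and `read_edits`, and the `stop_at_partial` flag are all hypothetical and chosen for illustration; this is not the HDFS edit log format or the branch-1 recovery mode code.

```python
import struct
import zlib


class CorruptEditLog(Exception):
    """Raised when a record is truncated or fails its checksum."""


def encode_record(payload: bytes) -> bytes:
    # Toy layout (an assumption, not the HDFS on-disk format):
    # 4-byte big-endian length, payload, 4-byte CRC32 of the payload.
    header = struct.pack(">I", len(payload))
    trailer = struct.pack(">I", zlib.crc32(payload))
    return header + payload + trailer


def read_edits(data: bytes, stop_at_partial: bool = False) -> list:
    """Return the payloads of all records that can be read from data.

    With stop_at_partial=False, any truncated or corrupt record raises
    CorruptEditLog, which mirrors a reader that refuses to start.
    With stop_at_partial=True, reading stops at the first bad record and
    the records read so far are returned.
    """
    records = []
    pos = 0
    while pos < len(data):
        try:
            if pos + 4 > len(data):
                raise CorruptEditLog(f"truncated header at offset {pos}")
            (length,) = struct.unpack_from(">I", data, pos)
            end = pos + 4 + length + 4
            if end > len(data):
                raise CorruptEditLog(f"truncated record at offset {pos}")
            payload = data[pos + 4 : pos + 4 + length]
            (crc,) = struct.unpack_from(">I", data, pos + 4 + length)
            if crc != zlib.crc32(payload):
                raise CorruptEditLog(f"bad checksum at offset {pos}")
        except CorruptEditLog:
            if stop_at_partial:
                break  # keep what was read, drop the rest
            raise
        records.append(payload)
        pos = end
    return records


if __name__ == "__main__":
    log = encode_record(b"mkdir /a") + encode_record(b"mkdir /b")
    # Simulate a disk filling up partway through the last write.
    truncated = log + encode_record(b"mkdir /c")[:5]
    print(read_edits(truncated, stop_at_partial=True))
```

Note that in this sketch the flag stops at the first bad record wherever it occurs, not only at the tail, so anything after that point is discarded.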

bq. Again, the edit log toleration can be disabled. I think there is no reason to revert the

Clearly not everyone agrees with the approach this patch took, myself included. I'm still
not even convinced that the option this patch introduces is an improvement, even if it can
be disabled. Let's please try to find a way to address whatever concerns you have that's agreeable
to everyone.
> Allow namenode to tolerate edit log corruption
> ----------------------------------------------
>                 Key: HDFS-3521
>                 URL: https://issues.apache.org/jira/browse/HDFS-3521
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 1.2.0
>         Attachments: h3521_20120610_b-1.patch, h3521_20120611_b-1.0.patch, h3521_20120611_b-1.patch
> HDFS-3479 adds checking for edit log corruption. It uses a fixed UNCHECKED_REGION_LENGTH
> (=PREALLOCATION_LENGTH) so that the bytes at the end within that length are not checked. Instead
> of not checking the bytes, we should check everything and allow toleration.
