hadoop-hdfs-issues mailing list archives

From "Suresh Srinivas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3540) Further improvement on recovery mode and edit log toleration in branch-1
Date Tue, 11 Sep 2012 03:58:07 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452697#comment-13452697 ]

Suresh Srinivas commented on HDFS-3540:
---------------------------------------

Error toleration is a useful feature when the namenode needs to continue functioning without
manual intervention. This helps in an HA setup. That said, I think recovery mode may be useful
if a cluster admin chooses to make recovery manual.

From what I understand based on previous comments, recovery mode allows an operator to either
continue with a corrupt editlog or abort. Not sure if abort is really a choice. What would one
do after abort? To that end, Nicholas, some of the information you print, such as the editlog
length, corruption length, and padding length, should be printed in recovery mode. This
information will be useful when one wants to continue while ignoring the corrupt part of the
editlog.

Given this, I would leave recovery mode alone and not remove it. Perhaps we should consider
printing more information during recovery to help an admin understand the state of the editlog.
Is that possible?
                
> Further improvement on recovery mode and edit log toleration in branch-1
> ------------------------------------------------------------------------
>
>                 Key: HDFS-3540
>                 URL: https://issues.apache.org/jira/browse/HDFS-3540
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 1.2.0
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>
> *Recovery Mode*: HDFS-3479 backported HDFS-3335 to branch-1.  However, the recovery mode
> feature in branch-1 is dramatically different from the recovery mode in trunk since the edit
> log implementations in these two branches are different.  For example, there is
> UNCHECKED_REGION_LENGTH in branch-1 but not in trunk.
> *Edit Log Toleration*: HDFS-3521 added this feature to branch-1 to remedy
> UNCHECKED_REGION_LENGTH and to tolerate edit log corruption.
> There are overlaps between these two features.  We study potential further improvements
> in this issue.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
