hadoop-hdfs-issues mailing list archives

From "Eli Collins (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2093) 1073: Handle case where an entirely empty log is left during NN crash
Date Wed, 22 Jun 2011 00:18:51 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052965#comment-13052965 ]

Eli Collins commented on HDFS-2093:
-----------------------------------

Marking the log as corrupt makes sense since that enforces the invariant that all valid logs
have the START_LOG_SEGMENT op.

doTestCrashRecoveryEmptyLog assumes the cluster should not start even if just one of the
dirs has a corrupted log. Shouldn't the cluster start as long as only one of the in-progress
logs was truncated?
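
To make the question concrete, here is a minimal sketch of the alternative policy: a log
that does not begin with START_LOG_SEGMENT (including an entirely empty one) is treated as
corrupt, but startup is allowed as long as at least one storage directory still holds a
valid log. The class, enum, and method names are hypothetical and are not taken from the
patch; ops are modeled as plain strings purely for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not the HDFS-2093 patch or any real HDFS API.
final class EmptyLogRecoveryPolicy {
    enum LogState { VALID, CORRUPT }

    private EmptyLogRecoveryPolicy() {}

    /** A log is valid only if its first op is START_LOG_SEGMENT; an empty log is corrupt. */
    static LogState classify(List<String> opNames) {
        boolean startsSegment = !opNames.isEmpty()
                && "START_LOG_SEGMENT".equals(opNames.get(0));
        return startsSegment ? LogState.VALID : LogState.CORRUPT;
    }

    /** Assumed policy: startup may proceed if at least one directory has a valid log. */
    static boolean canStart(List<LogState> perDirectoryStates) {
        return perDirectoryStates.contains(LogState.VALID);
    }

    public static void main(String[] args) {
        LogState truncated = classify(Collections.<String>emptyList());
        LogState healthy = classify(Arrays.asList("START_LOG_SEGMENT", "OP_ADD"));
        System.out.println("one dir truncated, one healthy: "
                + canStart(Arrays.asList(truncated, healthy)));
        System.out.println("all dirs truncated: "
                + canStart(Arrays.asList(truncated, truncated)));
    }
}
```

Under this assumed policy the NN would refuse to start only when every directory's
in-progress log is corrupt.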

Nit: "it has no transactions" should be indented.

Otherwise looks great.


> 1073: Handle case where an entirely empty log is left during NN crash
> ---------------------------------------------------------------------
>
>                 Key: HDFS-2093
>                 URL: https://issues.apache.org/jira/browse/HDFS-2093
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>    Affects Versions: Edit log branch (HDFS-1073)
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: Edit log branch (HDFS-1073)
>
>         Attachments: hdfs-2093.txt, hdfs-2093.txt, hdfs-2093.txt
>
>
> In fault-testing the HDFS-1073 branch, I saw the following situation:
> - NN has two storage directories, but one is in failed state
> - NN starts to roll edits logs to edits_inprogress_5160285
> - NN then crashes
> - on restart, it detects the truncated log, but since it has 0 txns, it finalizes it
>   to the nonsense log name edits_5160285-5160284.
> - It then starts logs again at edits_inprogress_5160285.
> - After this point, no checkpoints or future NN startups succeed since there are two
>   logs starting with the same txid
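
The "nonsense" name in the description follows from simple arithmetic: if the last txid of
a finalized segment is derived as first txid plus transaction count minus one, a segment
with 0 txns ends one txid before it starts. The sketch below reproduces that arithmetic.
The helper names are hypothetical, and the derivation of the last txid is an assumption
made for illustration; only the file name patterns come from the report.

```java
// Hypothetical helpers; only the file name patterns come from the issue description.
final class EditLogNames {
    private EditLogNames() {}

    /** Name of an in-progress segment that starts at firstTxId. */
    static String inProgressName(long firstTxId) {
        return "edits_inprogress_" + firstTxId;
    }

    /** Naive finalized name: assumes last txid = first txid + txn count - 1. */
    static String naiveFinalizedName(long firstTxId, long numTxns) {
        long lastTxId = firstTxId + numTxns - 1;
        return "edits_" + firstTxId + "-" + lastTxId;
    }

    /** A finalized range is well-formed only when it covers at least one txid. */
    static boolean isValidRange(long firstTxId, long lastTxId) {
        return firstTxId <= lastTxId;
    }

    public static void main(String[] args) {
        long first = 5160285L;
        // Empty segment: the computed range ends before it begins.
        System.out.println(naiveFinalizedName(first, 0));
        // The next segment reuses the same starting txid, hence the conflict.
        System.out.println(inProgressName(first));
        System.out.println("range valid: " + isValidRange(first, first - 1));
    }
}
```

A check like isValidRange before finalizing is one way to catch the empty-segment case
early; the comment above discusses marking such a log as corrupt instead.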

