hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-2093) 1073: Handle case where an entirely empty log is left during NN crash
Date Tue, 21 Jun 2011 05:05:47 GMT

     [ https://issues.apache.org/jira/browse/HDFS-2093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-2093:
------------------------------

          Component/s: name-node
          Description: 
In fault-testing the HDFS-1073 branch, I saw the following situation:
- NN has two storage directories, but one is in failed state
- NN starts to roll edit logs to edits_inprogress_5160285
- NN then crashes
- On restart, it detects the truncated log, but since it has 0 txns, it finalizes it to the
nonsense log name edits_5160285-5160284.
- It then starts logging again at edits_inprogress_5160285.
- After this point, no checkpoints or future NN startups succeed since there are two logs
starting with the same txid.
    Affects Version/s: Edit log branch (HDFS-1073)
        Fix Version/s: Edit log branch (HDFS-1073)

> 1073: Handle case where an entirely empty log is left during NN crash
> ---------------------------------------------------------------------
>
>                 Key: HDFS-2093
>                 URL: https://issues.apache.org/jira/browse/HDFS-2093
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>    Affects Versions: Edit log branch (HDFS-1073)
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: Edit log branch (HDFS-1073)
>
>
> In fault-testing the HDFS-1073 branch, I saw the following situation:
> - NN has two storage directories, but one is in failed state
> - NN starts to roll edit logs to edits_inprogress_5160285
> - NN then crashes
> - On restart, it detects the truncated log, but since it has 0 txns, it finalizes it
> to the nonsense log name edits_5160285-5160284.
> - It then starts logging again at edits_inprogress_5160285.
> - After this point, no checkpoints or future NN startups succeed since there are two
> logs starting with the same txid.
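
The failure in the steps above comes from computing a finalized name for a log that holds no transactions: with a first txid of 5160285 and 0 txns, the last txid works out to 5160284, one less than the first. The sketch below shows one possible guard against that and is not the patch attached to this issue. The class EmptyLogRecoverySketch and its methods are hypothetical and are not Hadoop APIs; only the file name patterns are taken from the report.

```java
import java.util.Optional;

// Hypothetical helper for illustration only; not part of Hadoop.
final class EmptyLogRecoverySketch {
    private EmptyLogRecoverySketch() {}

    /** Name of an in-progress log, following the pattern shown in the report. */
    static String inProgressName(long firstTxId) {
        return "edits_inprogress_" + firstTxId;
    }

    /**
     * Returns the finalized name for a recovered in-progress log, or empty
     * when the log holds no transactions. An empty result signals that the
     * caller should discard or move the file aside (an assumed policy)
     * instead of finalizing it to a range whose end precedes its start.
     */
    static Optional<String> finalizedName(long firstTxId, long numTxns) {
        if (numTxns < 0) {
            throw new IllegalArgumentException("numTxns must be >= 0");
        }
        if (numTxns == 0) {
            // Without this guard the name would be edits_<first>-<first - 1>.
            return Optional.empty();
        }
        long lastTxId = firstTxId + numTxns - 1;
        return Optional.of("edits_" + firstTxId + "-" + lastTxId);
    }
}
```

With this guard, a recovery routine would finalize a log only when finalizedName returns a value, so a later roll to the same in-progress name would not collide with a finalized log starting at the same txid.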

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
