hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-2074) 1073: determine edit log validity by truly reading and validating transactions
Date Wed, 15 Jun 2011 01:04:47 GMT

     [ https://issues.apache.org/jira/browse/HDFS-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-2074:

    Attachment: hdfs-2074.txt

The attached patch does the following:
- refactor the reading of the FSEditLog header into a new function/class, so that log verification
  can share more code with log application
- similarly push the construction of a checksumming stream into the Reader class
- move the old {{getValidLength}} implementation into the test code, since it's no longer used
  by the non-test code paths
- change FSImageTransactionalStorageInspector to base its decision on which log to recover
  on how many valid txns are in the files
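
To illustrate the last bullet, here is a minimal sketch of choosing a log by its valid transaction count. It is not the code in hdfs-2074.txt: the {{LogCandidate}} class, the {{pickLogToRecover}} method, and the directory paths are hypothetical names for this example, and "prefer the file with the most valid txns" is an assumed policy, since the message only says the decision is based on that count.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class LogSelectionSketch {

    /** Hypothetical holder: one candidate edit log and the number of txns that validated. */
    static final class LogCandidate {
        final String path;
        final long validTxns;

        LogCandidate(String path, long validTxns) {
            this.path = path;
            this.validTxns = validTxns;
        }
    }

    /**
     * Assumed policy: recover from the candidate with the most valid txns.
     * Returns an empty Optional when there are no candidates.
     */
    static Optional<LogCandidate> pickLogToRecover(List<LogCandidate> candidates) {
        return candidates.stream()
                .max(Comparator.comparingLong((LogCandidate c) -> c.validTxns));
    }

    public static void main(String[] args) {
        // Two copies of the same in-progress log in two hypothetical storage directories.
        List<LogCandidate> candidates = Arrays.asList(
                new LogCandidate("/data/1/current/edits_2_inprogress", 12),
                new LogCandidate("/data/2/current/edits_2_inprogress", 29));

        pickLogToRecover(candidates)
                .ifPresent(c -> System.out.println(c.path + " (" + c.validTxns + " valid txns)"));
    }
}
```

The point of the sketch is only that the comparison key is a count of validated transactions rather than a byte length.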

I'd like to move the Op.Reader, Op.LogHeader, and the validation code into a new class, but
rather than combine that refactoring with this patch, I think it should go in a separate JIRA.

> 1073: determine edit log validity by truly reading and validating transactions
> ------------------------------------------------------------------------------
>                 Key: HDFS-2074
>                 URL: https://issues.apache.org/jira/browse/HDFS-2074
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>    Affects Versions: Edit log branch (HDFS-1073)
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: Edit log branch (HDFS-1073)
>         Attachments: hdfs-2074.txt
> HDFS-2003 separated the deserialization/reading of log records from the application of
> those records to a namesystem. This means that we can now read through an edit log in order
> to determine how many valid transactions are actually stored within. This is an improvement
> on what the 1073 branch currently does, which is to simply look at how many bytes come before
> the 0xFFFF... trailer at the end of the file.
> The next step after this is to use these new functions so that, when the NN starts up
> and finds "in-progress" files like "edits_2_inprogress", it will rename them to their finalized
> name like "edits_2-30" based on how many transactions are truly stored within. This will simplify
> logic elsewhere.
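
The idea in the description above can be sketched roughly as follows. This is an illustration under loud assumptions, not the HDFS implementation: the record layout (8-byte txid, 4-byte payload length, payload, 8-byte CRC32 of the payload) is a toy format invented for this example, and {{writeTxn}}, {{buildLog}}, {{countValidTxns}}, and {{finalizedName}} are hypothetical names. The sketch counts records that actually read back and checksum correctly, stops at the first bad one (including 0xFF padding), and derives a finalized name from the count.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.CRC32;

class EditLogValidationSketch {

    /** Appends one record in the toy format: txid, payload length, payload, CRC32 of payload. */
    static void writeTxn(DataOutputStream out, long txid, byte[] payload) throws IOException {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        out.writeLong(txid);
        out.writeInt(payload.length);
        out.write(payload);
        out.writeLong(crc.getValue());
    }

    /** Builds an in-memory log of `count` one-byte-payload txns followed by 0xFF padding. */
    static byte[] buildLog(long firstTxId, int count, int trailerBytes) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            for (int i = 0; i < count; i++) {
                long txid = firstTxId + i;
                writeTxn(out, txid, new byte[] {(byte) txid});
            }
            for (int i = 0; i < trailerBytes; i++) {
                out.writeByte(0xFF);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Counts txns that read back cleanly, stopping at the first gap, bad checksum, or truncation. */
    static long countValidTxns(byte[] log, long firstTxId) {
        long expected = firstTxId;
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(log))) {
            while (true) {
                long txid = in.readLong();
                int len = in.readInt();
                // Padding (all 0xFF) or a txid gap fails the first check.
                if (txid != expected || len < 0 || len > in.available()) {
                    break;
                }
                byte[] payload = new byte[len];
                in.readFully(payload);
                long storedCrc = in.readLong();
                CRC32 crc = new CRC32();
                crc.update(payload, 0, payload.length);
                if (crc.getValue() != storedCrc) {
                    break;
                }
                expected++;
            }
        } catch (IOException e) {
            // EOFException from a truncated record simply ends the scan.
        }
        return expected - firstTxId;
    }

    /** Derives a name of the form edits_<first>-<last>; an empty log has no finalized name. */
    static String finalizedName(long firstTxId, long numValid) {
        if (numValid <= 0) {
            throw new IllegalArgumentException("no valid transactions to finalize");
        }
        return "edits_" + firstTxId + "-" + (firstTxId + numValid - 1);
    }

    public static void main(String[] args) {
        byte[] log = buildLog(2, 29, 16);   // txids 2..30, then 16 bytes of 0xFF
        long valid = countValidTxns(log, 2);
        System.out.println(valid + " valid txns -> " + finalizedName(2, valid));
    }
}
```

With the first txid 2 and 29 valid records, the derived name matches the "edits_2-30" example in the description. A byte-length check alone would not notice a corrupted record in the middle of the file, which is the gap that reading and validating each transaction closes.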

