hadoop-hdfs-issues mailing list archives

From "Ivan Kelly (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2003) Separate FSEditLog reading logic from editLog memory state building logic
Date Fri, 03 Jun 2011 15:51:47 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13043420#comment-13043420 ]

Ivan Kelly commented on HDFS-2003:

> From the perspective of FSEditLogLoader, an EOFException while reading the opcode (expected)
> is treated the same as an EOFException in the middle of reading one of the ops (unexpected).
> This seems unintentional, since a truncated log file is not normal, right?

This is intentional. We read from the journal that has the most edits available to read.
If that journal happens to contain a truncated file, it is still the journal with
the most up-to-date logs. Do you disagree?
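
To make the two EOF cases concrete, here is a minimal sketch, not the FSEditLogLoader code itself. It assumes a made-up record layout (one opcode byte followed by an 8-byte payload), and the names TruncationScan and ScanResult are hypothetical. It reports whether the stream ended cleanly at an opcode boundary or was cut off mid-op; in both cases every complete op before that point is still counted, which is the behaviour described above.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper: scans a stream of fixed-size records (assumed layout, not the HDFS one). */
final class TruncationScan {

    /** Outcome of a scan: how many ops were complete, and whether the tail was cut mid-op. */
    static final class ScanResult {
        final int completeOps;
        final boolean truncatedMidOp;

        ScanResult(int completeOps, boolean truncatedMidOp) {
            this.completeOps = completeOps;
            this.truncatedMidOp = truncatedMidOp;
        }
    }

    static ScanResult scan(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        int ops = 0;
        while (true) {
            // read() returns -1 at end of stream: the expected way for a log to end.
            int opcode = in.read();
            if (opcode == -1) {
                return new ScanResult(ops, false);
            }
            try {
                // Assumed 8-byte payload; readLong() throws EOFException if fewer bytes remain.
                in.readLong();
            } catch (EOFException e) {
                // Unexpected: the file was truncated in the middle of an op.
                return new ScanResult(ops, true);
            }
            ops++;
        }
    }
}
```

A caller that only wants "as many edits as are readable" can ignore truncatedMidOp, while one that wants to warn about a damaged file can check it.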

> I'm thinking it might be slightly cleaner to make a new class like FSEditLogReader, which
> is instantiated with an InputStream and logVersion. It would then expose just a readOp() method.
> I think that will make it easier to mock up sources of edits in the future. What do you think?

Makes sense.
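
For illustration only, the shape of such a reader might look like the sketch below. It is not the patch: SketchOp, EditLogReaderSketch and the on-disk layout (one opcode byte plus an 8-byte transaction id) are assumptions made for the example, and logVersion is merely stored where a real reader would use it to choose a decoding.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical op record: an opcode plus a transaction id. Not the real HDFS type. */
final class SketchOp {
    final byte opcode;
    final long txid;

    SketchOp(byte opcode, long txid) {
        this.opcode = opcode;
        this.txid = txid;
    }
}

/** Sketch of a reader built from a stream and a log version, exposing only readOp(). */
final class EditLogReaderSketch {
    private final DataInputStream in;
    private final int logVersion; // kept for illustration; a real reader would branch on it

    EditLogReaderSketch(InputStream in, int logVersion) {
        this.in = new DataInputStream(in);
        this.logVersion = logVersion;
    }

    int getLogVersion() {
        return logVersion;
    }

    /** Returns the next op, or null once no further complete op can be read. */
    SketchOp readOp() throws IOException {
        try {
            byte opcode = in.readByte();
            long txid = in.readLong();
            return new SketchOp(opcode, txid);
        } catch (EOFException e) {
            // End of stream, whether at an op boundary or mid-op.
            return null;
        }
    }
}
```

Because the constructor takes a plain InputStream, a test can hand it a ByteArrayInputStream filled with hand-built bytes instead of a real edits file.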

> Separate FSEditLog reading logic from editLog memory state building logic
> -------------------------------------------------------------------------
>                 Key: HDFS-2003
>                 URL: https://issues.apache.org/jira/browse/HDFS-2003
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: Edit log branch (HDFS-1073)
>            Reporter: Ivan Kelly
>            Assignee: Ivan Kelly
>             Fix For: Edit log branch (HDFS-1073)
>         Attachments: HDFS-2003.diff, HDFS-2003.diff
> Currently FSEditLogLoader has code for reading from an InputStream interleaved with code
> which updates the FSNameSystem and FSDirectory. This makes it difficult to read an edit log
> without having a whole load of other objects initialised, which is problematic if you want
> to do things like count how many transactions are in a file.
> This patch separates the reading of the stream and the building of the memory state.
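
As a rough sketch of what the separation buys (hypothetical names throughout; EditOp and EditConsumers are not classes from the patch): once ops arrive as a plain sequence, counting transactions needs no namespace objects at all, and building state becomes just another consumer of the same sequence. The "state" here is simplified to a set of paths.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

/** Hypothetical edit op: either adds or deletes a path. */
final class EditOp {
    enum Kind { ADD, DELETE }

    final Kind kind;
    final String path;

    EditOp(Kind kind, String path) {
        this.kind = kind;
        this.path = path;
    }
}

/** Two independent consumers of the same op sequence. */
final class EditConsumers {

    /** Counts ops without touching any in-memory namespace state. */
    static int countTransactions(Iterator<EditOp> ops) {
        int n = 0;
        while (ops.hasNext()) {
            ops.next();
            n++;
        }
        return n;
    }

    /** Applies ops to a simplified stand-in for the namespace: a set of live paths. */
    static Set<String> buildState(Iterator<EditOp> ops) {
        Set<String> paths = new HashSet<>();
        while (ops.hasNext()) {
            EditOp op = ops.next();
            if (op.kind == EditOp.Kind.ADD) {
                paths.add(op.path);
            } else {
                paths.remove(op.path);
            }
        }
        return paths;
    }
}
```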

