hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1799) Refactor log rolling and filename management out of FSEditLog
Date Tue, 03 May 2011 01:39:03 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027977#comment-13027977 ]

Todd Lipcon commented on HDFS-1799:

I agree that your design avoids introducing a new class by collapsing the file management
(i.e. "log lifecycle") code into EditLogOutputStream. However, I don't feel like "fewer code
changes" or "fewer classes" are particularly appropriate design goals. For me, the goal of
the design is to clearly separate the concerns of the different classes:
- *FSEditLog* - manages edit logs for the entire namenode, coordinates group commit
and error handling, exposes a single object to the rest of the NN.
- *JournalManager* - handles the log lifecycle and file (or ledger) management for each individual
- *EditLogOutputStream* - a direct parallel of OutputStream plus the minimal APIs necessary
for syncing

The nice thing about this design is that EditLogOutputStream needs no conception of StorageDirectories,
for example. In fact, it is barely coupled to HDFS at all with the exception of using a few
static constants.
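
As a rough illustration of this separation, here is a minimal sketch. Every class name, method name, and signature in it (SketchEditLog, SketchJournalManager, startLogSegment, the "segment_" naming, the in-memory journal) is hypothetical and chosen for the example; none of it is taken from the patch. The point is only that the stream type knows nothing about where or how segments are stored.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names and signatures are illustrative only,
// not the API of the HDFS-1799 patch.

/** Stream concern only: accept bytes and make them durable on sync. */
abstract class SketchEditLogOutputStream {
    abstract void write(byte[] op) throws IOException;
    abstract void flushAndSync() throws IOException;
}

/** Lifecycle concern: decides where a log segment lives and how it is named. */
interface SketchJournalManager {
    SketchEditLogOutputStream startLogSegment(long firstTxId) throws IOException;
}

/** In-memory stand-in for a file- or ledger-backed journal. */
class InMemoryJournalManager implements SketchJournalManager {
    final List<String> segmentNames = new ArrayList<>();
    final ByteArrayOutputStream durable = new ByteArrayOutputStream();

    @Override
    public SketchEditLogOutputStream startLogSegment(long firstTxId) {
        // Naming and placement are owned here; the stream never sees them.
        segmentNames.add("segment_" + firstTxId);
        return new SketchEditLogOutputStream() {
            private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

            @Override
            void write(byte[] op) {
                buffer.write(op, 0, op.length);
            }

            @Override
            void flushAndSync() throws IOException {
                // Move buffered bytes to the "durable" store, then clear the buffer.
                buffer.writeTo(durable);
                buffer.reset();
            }
        };
    }
}

/** Top-level concern: fan each edit out to every journal and sync them together. */
class SketchEditLog {
    private final List<SketchEditLogOutputStream> streams = new ArrayList<>();

    SketchEditLog(List<? extends SketchJournalManager> journals, long firstTxId)
            throws IOException {
        for (SketchJournalManager jm : journals) {
            streams.add(jm.startLogSegment(firstTxId));
        }
    }

    void logEdit(byte[] op) throws IOException {
        for (SketchEditLogOutputStream s : streams) {
            s.write(op);
        }
    }

    void logSync() throws IOException {
        for (SketchEditLogOutputStream s : streams) {
            s.flushAndSync();
        }
    }
}
```

In this sketch, swapping the in-memory journal for a file-backed or ledger-backed one would only touch the SketchJournalManager implementation; SketchEditLog and the stream contract would stay the same.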

In contrast, your design collapses layout management into the output stream, thus making EditLogFileOutputStream
depend on StorageDirectory, NNStorage, etc.

Lastly, I want to point out that HDFS-1580 will require a class like this anyway (called {{Journal}}
in your design doc). Though this current patch doesn't address it, it will be a clear extension
of JournalManager to add the input-side calls, the purging calls, etc.

bq. Also, it uses the Prototype design pattern which I can't ever recall seeing used in the
I'm not sure what you're saying here - you're saying this is a good thing? :)

Clearly we have a difference of opinion on this design, but could you please indicate how
strong your objections are? That is, are you -1ing this design, or just proposing another option?
Given that I already have a bunch of work lined up (and blocked) behind this, I'd really like
to close this out in the next day or two.

> Refactor log rolling and filename management out of FSEditLog
> -------------------------------------------------------------
>                 Key: HDFS-1799
>                 URL: https://issues.apache.org/jira/browse/HDFS-1799
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: Edit log branch (HDFS-1073)
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: Edit log branch (HDFS-1073)
>         Attachments: 0001-Added-state-management-to-FSEditLog.patch, 0002-Standardised-error-pattern.patch,
0003-Add-JournalFactory-and-move-divert-revert-out-of-FSE.patch, HDFS-1799-all.diff, hdfs-1799.txt,
hdfs-1799.txt, hdfs-1799.txt, hdfs-1799.txt
> This is somewhat similar to HDFS-1580, but less ambitious. While that JIRA focuses on
pluggability, this task is simply the minimum needed for HDFS-1073:
> - Refactor the filename-specific code for rolling, diverting, and reverting log streams
out of FSEditLog into a new class
> - Clean up the related code in FSEditLog a bit
> Notably, this JIRA is going to temporarily break the BackupNode. I plan to circle back
on the BackupNode later on this branch.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
