hadoop-hdfs-issues mailing list archives

From "Jitendra Nath Pandey (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2018) 1073: Move all journal stream management code into one place
Date Fri, 12 Aug 2011 06:24:27 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083970#comment-13083970
] 

Jitendra Nath Pandey commented on HDFS-2018:
--------------------------------------------

>I think it's a little easier to understand these APIs.
I don't think so. The patch adds EditLogReference as a new concept, which actually complicates
the API: wherever FSEditLog and FSImage use JournalManager, the code now has to use the new
interface.
Ivan's patch uses EditLogInputStream, which is already in use.

 In HDFS-1580, we discussed moving towards modeling journals as a single continuous stream
of transactions instead of segments. This patch doesn't remove segments, but we don't want
to proliferate their use. EditLogReference actually leaks the idea of segments into
different parts of the code, under another name.
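
 As a rough illustration of the "single continuous stream" idea (a sketch only: the class,
the Segment type and the transactions() method are hypothetical names, not code from either
patch or from HDFS), a caller could iterate over transaction ids while segment boundaries stay
an internal detail. It assumes the segments are non-empty and sorted by transaction id.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ContinuousStreamSketch {
    /** Hypothetical segment: a contiguous run of txids, firstTxId..lastTxId inclusive. */
    static final class Segment {
        final long firstTxId;
        final long lastTxId;

        Segment(long firstTxId, long lastTxId) {
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
        }
    }

    /**
     * Returns the txids of all segments as one continuous stream.
     * Assumption: segments are sorted and each has firstTxId <= lastTxId.
     * Callers only ever see transaction ids, never a Segment.
     */
    static Iterator<Long> transactions(final List<Segment> segments) {
        return new Iterator<Long>() {
            private int index = 0;
            private long next = segments.isEmpty() ? 0 : segments.get(0).firstTxId;

            @Override
            public boolean hasNext() {
                return index < segments.size();
            }

            @Override
            public Long next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                long result = next;
                if (next == segments.get(index).lastTxId) {
                    // Current segment is exhausted: move on to the next one, if any.
                    index++;
                    if (index < segments.size()) {
                        next = segments.get(index).firstTxId;
                    }
                } else {
                    next++;
                }
                return result;
            }
        };
    }
}
```

 The point of the sketch is only that the segment type never appears in the signature the
callers use.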


> treats log recovery as an explicit step at startup - it's good to make it explicit since
we need to not do recovery when a NN starts up in standby mode, for example.
  recoverUnclosedLogs should not be implemented by a journal. A simpler JournalManager
interface is better, one where the journal just knows what is in progress and what is finalized.
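
  To make the shape I have in mind concrete, here is a minimal sketch. All names (SimpleJournal,
LogInfo, findUnclosed) are hypothetical and are not the JournalManager API from either patch.
The journal only reports which logs are in progress and which are finalized; deciding whether
to recover is left to the caller, so a NN starting in standby mode can simply not call it.

```java
import java.util.ArrayList;
import java.util.List;

public class JournalSketch {
    /** Hypothetical description of one log: its txid range and whether it is still open. */
    static final class LogInfo {
        final long firstTxId;
        final long lastTxId;
        final boolean inProgress;

        LogInfo(long firstTxId, long lastTxId, boolean inProgress) {
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
            this.inProgress = inProgress;
        }
    }

    /** Hypothetical minimal journal: it reports state and performs no recovery itself. */
    interface SimpleJournal {
        List<LogInfo> listLogs();
    }

    /**
     * Caller-side helper: collects the logs that are still in progress.
     * What to do with them (recover, or skip in standby mode) is the caller's decision.
     */
    static List<LogInfo> findUnclosed(SimpleJournal journal) {
        List<LogInfo> unclosed = new ArrayList<LogInfo>();
        for (LogInfo log : journal.listLogs()) {
            if (log.inProgress) {
                unclosed.add(log);
            }
        }
        return unclosed;
    }
}
```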

> "edits transfer mechanism"
 I think we should leave that outside the scope of this jira, because we are not even sure
whether it will be needed in the future, as Ivan pointed out.

> I'm offering to do the work of modifying this to something I find more acceptable.
 It is important to identify and discuss the objections to the original patch. If you
upload a new patch with fundamental changes, it is hard for others to guess what your
objections to the original patch were.




> 1073: Move all journal stream management code into one place
> ------------------------------------------------------------
>
>                 Key: HDFS-2018
>                 URL: https://issues.apache.org/jira/browse/HDFS-2018
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Ivan Kelly
>            Assignee: Ivan Kelly
>             Fix For: 0.23.0
>
>         Attachments: HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff,
HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff,
HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff,
hdfs-2018-otherapi.txt, hdfs-2018.txt
>
>
> Currently in the HDFS-1073 branch, the code for creating output streams is in FileJournalManager
and the code for input streams is in the inspectors. This change does a number of things.
>   - Input and Output streams are now created by the JournalManager.
>   - FSImageStorageInspectors now deal with URIs when referring to edit logs
>   - Recovery of inprogress logs is performed by counting the number of transactions instead
of looking at the length of the file.
> The patch for this applies on top of the HDFS-1073 branch + HDFS-2003 patch.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
