hadoop-hdfs-issues mailing list archives

From "Ivan Kelly (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2018) 1073: Move all journal stream management code into one place
Date Thu, 11 Aug 2011 23:04:27 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083785#comment-13083785 ]

Ivan Kelly commented on HDFS-2018:
----------------------------------

{quote}
no need for the more complicated caching in FileJournalManager, since we only scan the directory
once
{quote}
The caching in FileJournalManager is only for inprogress logs, which will be recovered the
first time they are read while not being written to. This is the only caching. I don't think
scanning a directory during load time is going to cause a significant performance hit.

{quote}
treats log recovery as an explicit step at startup - it's good to make it explicit since we
need to not do recovery when a NN starts up in standby mode, for example.
{quote}
Proper fencing is the correct way to handle this. 

{quote}
the EditLogReference interface will also make it easier to allow other types of journal managers
to participate in edits-transfer, I think.
{quote}
Other journal types shouldn't need any edit transfer mechanism, as anything that isn't a local
file will be shared storage.

Frankly, I don't see that this alternative approach brings enough to justify holding this
patch off any longer. It has already been held up for 6 weeks for various reasons, and changing
the approach now is just going to delay it again. Jitendra and I have other patches which have
been held up waiting for this. It would have been nice if this alternative approach had been
proposed a month ago.

> 1073: Move all journal stream management code into one place
> ------------------------------------------------------------
>
>                 Key: HDFS-2018
>                 URL: https://issues.apache.org/jira/browse/HDFS-2018
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Ivan Kelly
>            Assignee: Ivan Kelly
>             Fix For: 0.23.0
>
>         Attachments: HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff,
>                      HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff,
>                      HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff,
>                      HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff,
>                      hdfs-2018-otherapi.txt
>
>
> Currently in the HDFS-1073 branch, the code for creating output streams is in FileJournalManager
> and the code for input streams is in the inspectors. This change does a number of things.
>   - Input and Output streams are now created by the JournalManager.
>   - FSImageStorageInspectors now deal with URIs when referring to edit logs.
>   - Recovery of inprogress logs is performed by counting the number of transactions instead
>     of looking at the length of the file.
> The patch for this applies on top of the HDFS-1073 branch + HDFS-2003 patch.
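
For illustration only, the idea in the issue summary of recovering an inprogress log by counting
transactions rather than trusting the file length could be sketched roughly as below. This is
not code from the HDFS-2018 patch. The class InProgressLogRecovery, its methods, and the record
layout (8-byte txid, 4-byte payload length, payload) are all hypothetical stand-ins, and the real
edit log format is different.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

/** Hypothetical sketch; not the HDFS-2018 patch and not the real edit log format. */
public class InProgressLogRecovery {
    /** Assumed upper bound on one payload, so garbage lengths are rejected. */
    private static final int MAX_PAYLOAD = 1 << 20;

    /**
     * Counts the complete, consecutively numbered transactions at the start of
     * the log. Scanning stops at the first truncated record, the first record
     * whose txid is not the expected one, or the end of the data.
     */
    public static long countTransactions(byte[] log, long firstTxId) {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(log));
        long expectedTxId = firstTxId;
        long count = 0;
        try {
            while (true) {
                long txId = in.readLong();
                int length = in.readInt();
                if (txId != expectedTxId || length < 0 || length > MAX_PAYLOAD) {
                    break; // padding or garbage after the last valid record
                }
                in.readFully(new byte[length]); // EOFException if the payload is cut short
                count++;
                expectedTxId++;
            }
        } catch (EOFException e) {
            // Clean end of data or a partially written record: stop counting.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    /** Test helper: writes n records in the assumed layout, each with a 4-byte payload. */
    public static byte[] encode(long firstTxId, int n) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        try {
            for (int i = 0; i < n; i++) {
                out.writeLong(firstTxId + i); // txid
                out.writeInt(4);              // payload length
                out.writeInt(i);              // payload
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] complete = encode(101, 3);
        // Simulate a preallocated file: the length no longer reflects the content.
        byte[] padded = Arrays.copyOf(complete, complete.length + 1024);
        System.out.println(countTransactions(padded, 101)); // prints 3
    }
}
```

In this sketch the zero padding does not change the result, which is the point of counting
transactions: with a first txid of 101 and a count of 3, the last valid txid is 103 regardless
of how long the file is. A production format would need stronger validity checks than the
simple txid sequence test assumed here.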

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
