hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2018) 1073: Move all journal stream management code into one place
Date Thu, 11 Aug 2011 23:18:27 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083796#comment-13083796 ]

Todd Lipcon commented on HDFS-2018:
-----------------------------------

bq. The caching in FileJournalManager is only for inprogress logs, which will be recovered
the first time they are read while not being written to. This is the only caching. I don't
think scanning a directory during load time is going to cause a significant performance hit.

It's not a performance thing - I'm just pointing out that it's a bit ugly and harder to understand
this way. The validation/recovery at read time is also quite messy IMO - an API that purports
to "select a stream to read" should not have a side effect of renaming it, etc.

bq. Proper fencing is the correct way to handle this.

No, fencing is to keep two NNs from writing at the same time to a storage directory. This
is about one NN reading while the other writes -- in your design, the reader ends up finalizing
log segments without explicitly saying "do recovery". Recovery is an operation that should
only be done by the active (writer). So making it an explicit API makes more sense.
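
To make the distinction concrete, here is a minimal sketch of a journal whose read path has no side effects and whose recovery is a separate, explicit call. Every name in it (SketchSegment, SketchJournal, selectSegment, recoverUnfinalizedSegments, the activeWriter flag) is hypothetical and chosen for illustration; it is not the code in the HDFS-1073 branch or in any attached patch, and the in-memory list merely stands in for a storage directory.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical log segment; not a class from the HDFS-1073 branch. */
final class SketchSegment {
    final long firstTxId;
    final long txCount;      // transactions actually present in the segment
    boolean inProgress;
    long lastTxId = -1;      // set only once the segment is finalized

    SketchSegment(long firstTxId, long txCount, boolean inProgress) {
        this.firstTxId = firstTxId;
        this.txCount = txCount;
        this.inProgress = inProgress;
        if (!inProgress) {
            this.lastTxId = firstTxId + txCount - 1;
        }
    }
}

/** Hypothetical journal that keeps reading and recovery as separate calls. */
final class SketchJournal {
    private final List<SketchSegment> segments = new ArrayList<>();
    private final boolean activeWriter;

    SketchJournal(boolean activeWriter) {
        this.activeWriter = activeWriter;
    }

    void add(SketchSegment segment) {
        segments.add(segment);
    }

    /** Read path: a pure lookup that never renames or finalizes anything. */
    SketchSegment selectSegment(long fromTxId) {
        for (SketchSegment s : segments) {
            if (s.firstTxId == fromTxId) {
                return s;
            }
        }
        return null;
    }

    /** Explicit recovery step, permitted only for the active writer. */
    int recoverUnfinalizedSegments() {
        if (!activeWriter) {
            throw new IllegalStateException("only the active writer may recover");
        }
        int recovered = 0;
        for (SketchSegment s : segments) {
            if (s.inProgress) {
                // Finalize based on the number of transactions counted.
                s.lastTxId = s.firstTxId + s.txCount - 1;
                s.inProgress = false;
                recovered++;
            }
        }
        return recovered;
    }
}
```

With this shape, a standby that only calls selectSegment() can never finalize a segment the active NN is still writing; finalization happens only when the writer asks for it.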

bq. Other journal types shouldn't need any edit transfer mechanism, as anything that isn't
a local file will be shared storage.

In the case of shared storage, the "edits transfer mechanism" is for the NN to return a pointer
to the file in its shared location (e.g. a pointer to the BK ledger, the file on NFS, whatever).
Having the NN provide this manifest to any standbys is still helpful, and makes files less
of a special case. I'm not proposing to attack this right now, but it does improve the current
code where we have "instanceof" checks.
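
As a rough illustration of the manifest idea, the sketch below has the NN hand out a list of transaction ranges, each with a URI pointing at where that segment lives. The class names, the entriesCovering() method, and the example URI formats are all assumptions made up for this sketch; nothing here is an existing HDFS or BookKeeper API.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical manifest entry: a transaction range plus a pointer to it. */
final class SketchManifestEntry {
    final long startTxId;
    final long endTxId;
    final URI location;   // local file, NFS path, ledger pointer, ...

    SketchManifestEntry(long startTxId, long endTxId, URI location) {
        this.startTxId = startTxId;
        this.endTxId = endTxId;
        this.location = location;
    }
}

/** Hypothetical manifest the NN could return to a standby. */
final class SketchEditsManifest {
    private final List<SketchManifestEntry> entries = new ArrayList<>();

    void add(long startTxId, long endTxId, String location) {
        entries.add(new SketchManifestEntry(startTxId, endTxId, URI.create(location)));
    }

    /** Entries that contain at least one transaction >= sinceTxId. */
    List<SketchManifestEntry> entriesCovering(long sinceTxId) {
        List<SketchManifestEntry> result = new ArrayList<>();
        for (SketchManifestEntry e : entries) {
            if (e.endTxId >= sinceTxId) {
                result.add(e);
            }
        }
        return result;
    }
}
```

A standby would then decide how to read each entry from its location (for instance by URI scheme) rather than by an "instanceof" check on the journal implementation, so a local file becomes just one more kind of pointer.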

bq. This patch has been held up for 6 weeks for various reasons, and changing the approach
now is just going to delay it again

It's a 100KB+ patch that both refactors and changes a lot of implementation. We separated
it into smaller patches, and those smaller patches have made steady progress over these 6
weeks. I'm offering to do the work of modifying this to something I find more acceptable.

bq. It would have been nice if this alternative approach had been proposed a month ago

I believe I did express my concern about these APIs both here and on HDFS-1580. The size of
the patch made it difficult to express exactly what the issues would be until the other refactors
had gotten done.

I well appreciate the frustration of having a patch take a long time to go in, but I'm trying
to work with you here...

> 1073: Move all journal stream management code into one place
> ------------------------------------------------------------
>
>                 Key: HDFS-2018
>                 URL: https://issues.apache.org/jira/browse/HDFS-2018
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Ivan Kelly
>            Assignee: Ivan Kelly
>             Fix For: 0.23.0
>
>         Attachments: HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff,
HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff,
HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff,
hdfs-2018-otherapi.txt
>
>
> Currently in the HDFS-1073 branch, the code for creating output streams is in FileJournalManager
> and the code for input streams is in the inspectors. This change does a number of things.
>   - Input and Output streams are now created by the JournalManager.
>   - FSImageStorageInspectors now deal with URIs when referring to edit logs.
>   - Recovery of inprogress logs is performed by counting the number of transactions instead
>     of looking at the length of the file.
> The patch for this applies on top of the HDFS-1073 branch + HDFS-2003 patch.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
