hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
Date Thu, 28 Jun 2012 18:55:46 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13403363#comment-13403363 ]

Todd Lipcon commented on HDFS-3077:

bq. Quorum is a semantics on the client/writer side and not the server side policy. Hence
the protocol for journaling should be generic enough. So let's not call it QJournalProtocol
and make it generic, allowing other types of clients/writers.

I disagree with this statement. The commit protocol is strongly intertwined with the way in
which the server has to behave. For example, the "new epoch" command needs to return
certain information about the current state of the journals and any previous Paxos-style 'accepted'
decisions. Trying to shoehorn it into a generic protocol doesn't make much sense to me.
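
As a rough illustration of that coupling (a sketch only, not code from the patch or the design document), the toy journal below shows why a "new epoch" call is more than a generic write RPC: the server has to remember the epoch it promised, report its current state back, and reject writers from older epochs. Every class, field, and method name here is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NewEpochResponse:
    # All field names are hypothetical, chosen for illustration only.
    last_promised_epoch: int       # epoch the journal had promised before this call
    last_txid: Optional[int]       # highest transaction id written so far, if any
    accepted_epoch: Optional[int]  # epoch of an earlier 'accepted' recovery decision, if any


class ToyJournal:
    """Toy journal server that fences writers by epoch."""

    def __init__(self) -> None:
        self.promised_epoch = 0
        self.last_txid: Optional[int] = None
        self.accepted_epoch: Optional[int] = None

    def new_epoch(self, epoch: int) -> NewEpochResponse:
        # Promise only strictly newer epochs, so an old writer cannot take over.
        if epoch <= self.promised_epoch:
            raise ValueError(
                f"epoch {epoch} is not newer than promised epoch {self.promised_epoch}"
            )
        previous = self.promised_epoch
        self.promised_epoch = epoch
        # The new writer needs this state to decide how to recover.
        return NewEpochResponse(previous, self.last_txid, self.accepted_epoch)

    def journal(self, epoch: int, txid: int) -> None:
        # Writes from an epoch older than the promised one are rejected.
        if epoch < self.promised_epoch:
            raise ValueError(f"writer epoch {epoch} is stale")
        self.last_txid = txid
```

A second writer that calls `new_epoch(2)` learns what the first writer left behind, and from then on the first writer's `journal(1, ...)` calls fail. A protocol that only exposes "append these edits" has no place for either behavior.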

bq. If you see some functionality missing in 3092, let's discuss and add it there, instead
of copying code and changing it separately

3092's "log syncing" stuff doesn't fit with the recovery protocol needed for correct operation
in a quorum commit setting. 3092's method of the JNs "registering" with the NN doesn't make
sense in this system either, since group membership changes have not yet been designed and are
quite complex. So it's not a matter of adding functionality to 3092; it's a matter of removing
a lot of the functionality which just doesn't fit with this commit protocol.

bq. Also 3092 has been in development in open, in incremental fashion. I think we should follow
this, instead of attaching a big patch from github.

I made a best effort to do it in the open and incrementally, but didn't get any responses
from the community. See HDFS-3188 and HDFS-3189 for example, both of which I posted back in
April. I remember in the same discussions you referenced above that you said you'd take a
look at these in the spirit of incremental progress. I understand you got busy with other
things, but I wasn't going to stop working on the project in the meantime. So, work progressed
and now there's a more fully baked implementation here.

Don't be fooled by the size of the patch - the majority of the lines of code are essentially
boilerplate -- protobuf translators, simple code to start/stop RPC and HTTP servers, etc.
I don't think this is unreasonably large to review.
> Quorum-based protocol for reading and writing edit logs
> -------------------------------------------------------
>                 Key: HDFS-3077
>                 URL: https://issues.apache.org/jira/browse/HDFS-3077
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: ha, name-node
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hdfs-3077-partial.txt, hdfs-3077.txt, qjournal-design.pdf, qjournal-design.pdf
> Currently, one of the weak points of the HA design is that it relies on shared storage
> such as an NFS filer for the shared edit log. One alternative that has been proposed is to
> depend on BookKeeper, a ZooKeeper subproject which provides a highly available replicated
> edit log on commodity hardware. This JIRA is to implement another alternative, based on a
> quorum commit protocol, integrated more tightly in HDFS and with the requirements driven only
> by HDFS's needs rather than more generic use cases. More details to follow.
