hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
Date Wed, 27 Jun 2012 05:13:46 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401969#comment-13401969 ]

Todd Lipcon commented on HDFS-3077:

bq. I want to understand how the changes can be reconciled with 3092. Currently BackupNode
is being updated to use the JournalService.

The journal interface as exposed by the quorum-capable Journal Node looks different enough
from the BackupNode that I don't see any merit to combining the IPC protocols. It only muddies
the interaction, IMO. For example, the QJournalProtocol has the concept of a "journal ID"
so that each JournalNode can host journals for multiple namespaces at once, as well as the
epoch concept which makes no sense in a BackupNode scenario. If we wanted to extend HDFS to
act more like a true quorum-driven system (a la ZooKeeper) where each of the nodes maintains
a full namespace as equal peers, we'd need to do more work on the commit protocol (e.g., adding
an explicit "commit" RPC distinct from "journal"). That kind of change hasn't been proposed
anywhere that I'm aware of, so I didn't want to complicate this design by considering it.
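
To make the "journal ID" and epoch concepts concrete, here is a minimal Java sketch. It is not the QJournalProtocol from the patch: the class and method names (JournalNodeSketch, newEpoch, journal, editCount) are hypothetical, and the rule that a write must carry exactly the currently promised epoch is a simplifying assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch only; names are NOT taken from the HDFS-3077 patch.
 * One node hosts several journals keyed by journal ID, and each journal
 * remembers the highest writer epoch it has promised to honor.
 */
public class JournalNodeSketch {

    /** Per-namespace journal state (illustrative). */
    private static final class Journal {
        long promisedEpoch = 0;                        // highest epoch accepted so far
        final List<String> edits = new ArrayList<>();  // stand-in for the edit log
    }

    // journal ID -> journal state, so one node can serve multiple namespaces
    private final Map<String, Journal> journals = new HashMap<>();

    private Journal journalFor(String journalId) {
        return journals.computeIfAbsent(journalId, id -> new Journal());
    }

    /** A writer claims an epoch; accepted only if strictly newer than any promised before. */
    public synchronized boolean newEpoch(String journalId, long epoch) {
        Journal j = journalFor(journalId);
        if (epoch <= j.promisedEpoch) {
            return false;
        }
        j.promisedEpoch = epoch;
        return true;
    }

    /** Append an edit; rejected unless the writer holds the current epoch (assumed rule). */
    public synchronized boolean journal(String journalId, long epoch, String edit) {
        Journal j = journalFor(journalId);
        if (epoch != j.promisedEpoch) {
            return false;  // stale or unannounced writer
        }
        j.edits.add(edit);
        return true;
    }

    /** Number of edits accepted for the given journal ID. */
    public synchronized int editCount(String journalId) {
        return journalFor(journalId).edits.size();
    }
}
```

In this sketch a writer that was superseded by a newer epoch can no longer append to that journal, while journals with other IDs on the same node are unaffected. Neither notion has an obvious counterpart in a BackupNode, which tracks a single namespace from a single active writer.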

At this point I would advocate removing the BackupNode entirely, as I don't know of a single
person who has used it in the ~2 years since it was introduced. But that's a separate discussion.

bq. Once this is done, we were planning to merge 3092 into trunk. How should we proceed to
merge 3077 and 3092 to trunk?

I used a bunch of the HDFS-3092 branch code and design in development of this JIRA, so I would
consider it to be "incorporated" into the 3077 branch already. So, I would advocate abandoning
the current 3092 branch, treating it as a stepping stone (server-side-only) along the way to the
full solution (server and client side implementation). Of course I'll make sure that Brandon and Hari are
given their due credit as co-authors of this patch.

bq. Is code review going to be based off of this or code changes into a branch on Apache Hadoop
code base?

I posted the git branch just for reference, since some contributors find it easier to do a
git pull rather than manually apply the patches locally for review. But the link above is
to the exact same code I've attached to the JIRA. Feel free to review by looking at the patch
or at the branch. Would it be helpful for me to make a branch in SVN and push the pre-review
patch series there for review instead of the external github? Let me know.
> Quorum-based protocol for reading and writing edit logs
> -------------------------------------------------------
>                 Key: HDFS-3077
>                 URL: https://issues.apache.org/jira/browse/HDFS-3077
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: ha, name-node
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hdfs-3077-partial.txt, hdfs-3077.txt, qjournal-design.pdf, qjournal-design.pdf
> Currently, one of the weak points of the HA design is that it relies on shared storage
> such as an NFS filer for the shared edit log. One alternative that has been proposed is to
> depend on BookKeeper, a ZooKeeper subproject which provides a highly available replicated
> edit log on commodity hardware. This JIRA is to implement another alternative, based on a
> quorum commit protocol, integrated more tightly in HDFS and with the requirements driven only
> by HDFS's needs rather than more generic use cases. More details to follow.
