hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
Date Wed, 27 Jun 2012 02:48:45 GMT

[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-3077:

    Attachment: hdfs-3077.txt

Here is an initial patch with the implementation of this design. It is not complete, but I'm
posting it here as it's already grown large, and I'd like to start the review process while
I continue to add test coverage and iron out various TODOs which are littered around the code.

As it stands, the code runs: I can successfully start and restart NNs, fail JNs, etc., and
it mostly "works as advertised". There are known deficiencies that I'm working on addressing;
most of them are marked with TODOs in the code.

This patch is on top of the following:

ffcfc55 HDFS-3190. 1: Extract code to atomically write a file containing a long
025759c HDFS-3571. Add URL support to EditLogFileInputStream
707a309 HDFS-3572. Clean up init of SPNEGO
d84516f HDFS-3573. Change instantiation of journal managers to have NSInfo
f61dc7d HDFS-3574. Fix race in GetImageServlet where file is removed during header-setting
(and those, in turn, are on top of trunk).

I did not end up basing this on the HDFS-3092 branch as I originally planned, though there's
a bunch of code borrowed from the early work done on that branch by Brandon and Hari. I would
have liked to use the code exactly as it was, but the differences in design made it too difficult
to try to reconcile, and I ended up copy-pasting and modifying rather than patching against
that branch. For example, all of the RPCs in this design go through an async queue in order
to do quorum writes.
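
To illustrate the idea of sending RPCs through an async queue to achieve quorum writes, here is a minimal Python sketch. It is not taken from the patch: the names FakeJournalNode, AsyncLogger, send_edits and quorum_write are hypothetical, the "RPC" is just a local method call, and the choice of one single-threaded queue per node (so calls to a given node stay ordered) is an assumption made for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeJournalNode:
    """Hypothetical in-memory stand-in for a remote journal node."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.edits = []

    def journal(self, txid, data):
        # Stands in for the real RPC; fails if the node is "down".
        if not self.healthy:
            raise IOError(f"{self.name} is down")
        self.edits.append((txid, data))
        return txid


class AsyncLogger:
    """Wraps one node with a single-threaded queue so its calls stay ordered."""

    def __init__(self, node):
        self.node = node
        self._queue = ThreadPoolExecutor(max_workers=1)

    def send_edits(self, txid, data):
        # Returns immediately with a Future; the call runs on the queue thread.
        return self._queue.submit(self.node.journal, txid, data)

    def close(self):
        self._queue.shutdown(wait=True)


def quorum_write(loggers, txid, data, timeout=5.0):
    """Send to all nodes; return True once a majority has acknowledged."""
    majority = len(loggers) // 2 + 1
    cond = threading.Condition()
    counts = {"ok": 0, "failed": 0}

    def on_done(future):
        with cond:
            if future.exception() is None:
                counts["ok"] += 1
            else:
                counts["failed"] += 1
            cond.notify_all()

    for logger in loggers:
        logger.send_edits(txid, data).add_done_callback(on_done)

    def decided():
        # Either a quorum succeeded, or too many failed for one to be possible.
        return (counts["ok"] >= majority
                or counts["failed"] > len(loggers) - majority)

    with cond:
        cond.wait_for(decided, timeout=timeout)
        return counts["ok"] >= majority
```

In this sketch the writer does not wait for the slowest node: with three nodes, a write returns as soon as two have acknowledged, and a single failed node does not block progress.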

Of course there will be follow-up work to create a test plan, add substantially more tests,
add docs, etc. But my hope is that, after review, we can commit this (and the prereq patches)
either to trunk or to a branch and work from there to address the remaining items, test, etc.
> Quorum-based protocol for reading and writing edit logs
> -------------------------------------------------------
>                 Key: HDFS-3077
>                 URL: https://issues.apache.org/jira/browse/HDFS-3077
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: ha, name-node
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hdfs-3077-partial.txt, hdfs-3077.txt, qjournal-design.pdf, qjournal-design.pdf
> Currently, one of the weak points of the HA design is that it relies on shared storage
> such as an NFS filer for the shared edit log. One alternative that has been proposed is to
> depend on BookKeeper, a ZooKeeper subproject which provides a highly available replicated
> edit log on commodity hardware. This JIRA is to implement another alternative, based on a
> quorum commit protocol, integrated more tightly in HDFS and with the requirements driven only
> by HDFS's needs rather than more generic use cases. More details to follow.
