hadoop-hdfs-issues mailing list archives

From "Bikas Saha (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
Date Tue, 03 Apr 2012 21:48:26 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245785#comment-13245785 ]

Bikas Saha commented on HDFS-3077:

I have a question about syncing journal nodes and quorum-based writes. There will always
be a case where a lost journal node comes back up and is syncing its state; the extreme example
is the replacement of a broken journal node with a new node.
While it is doing this, will it be part of the quorum when a quorum number of writes must
succeed?
Say we have 3 journals with the following txids:
JN1-100, JN2-100, JN3-0 (JN3 just joined)
Now say some edits get written to JN2 and JN3 (a quorum commit, with JN1's in-flight records still in
the queue because JN1 is slow):
JN1-100, JN2-110, JN3-110+syncing_holes
At this point something terrible happens, and when we recover we can only access JN1 and JN3:
JN1-100, JN3-110+syncing_holes
At this point, how do we resolve the ground truth about the journal state and edit
logs?
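
The scenario above can be replayed as a small model to make the question concrete. This is
only an illustrative sketch, not the HDFS-3077 design: the JournalNode class, its helper
methods, and the assumption that JN3 has back-filled txids 1 through 40 so far are all
hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class JournalNode:
    """Hypothetical model of a journal node: just the set of txids it holds."""
    name: str
    txids: set = field(default_factory=set)

    def highest_txid(self):
        # Highest txid present, ignoring any holes below it.
        return max(self.txids, default=0)

    def contiguous_through(self):
        # Largest N such that every txid in 1..N is present (0 if txid 1 is missing).
        n = 0
        while (n + 1) in self.txids:
            n += 1
        return n


def quorum_size(total_nodes):
    # Simple majority.
    return total_nodes // 2 + 1


# State after the quorum commit of txids 101..110 to JN2 and JN3 while JN1 lags.
jn1 = JournalNode("JN1", set(range(1, 101)))
jn2 = JournalNode("JN2", set(range(1, 111)))
# Assumption: JN3 accepted the new writes but has only back-filled 1..40 so far.
jn3 = JournalNode("JN3", set(range(1, 41)) | set(range(101, 111)))

# After the failure, only JN1 and JN3 are reachable. That is still a majority of 3.
reachable = [jn1, jn3]
assert len(reachable) >= quorum_size(3)

# Highest txid any reachable node claims to have.
highest_seen = max(node.highest_txid() for node in reachable)

# How far each reachable node can serve the log on its own, without holes.
contiguous = {node.name: node.contiguous_through() for node in reachable}

# Txids up to highest_seen that no reachable node holds at all.
union = set().union(*(node.txids for node in reachable))
missing = [t for t in range(1, highest_seen + 1) if t not in union]

print(highest_seen, contiguous, missing)
```

In this particular state neither reachable node holds a contiguous log up to txid 110 on
its own, although JN1 and JN3 together cover every txid. Whether recovery may rely on that
is exactly the open question.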

> Quorum-based protocol for reading and writing edit logs
> -------------------------------------------------------
>                 Key: HDFS-3077
>                 URL: https://issues.apache.org/jira/browse/HDFS-3077
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: ha, name-node
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hdfs-3077-partial.txt, qjournal-design.pdf
> Currently, one of the weak points of the HA design is that it relies on shared storage
> such as an NFS filer for the shared edit log. One alternative that has been proposed is to
> depend on BookKeeper, a ZooKeeper subproject which provides a highly available replicated
> edit log on commodity hardware. This JIRA is to implement another alternative, based on a
> quorum commit protocol, integrated more tightly in HDFS and with the requirements driven only
> by HDFS's needs rather than more generic use cases. More details to follow.
