hadoop-hdfs-issues mailing list archives

From "Hari Mankude (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
Date Tue, 03 Apr 2012 18:36:25 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245578#comment-13245578 ]

Hari Mankude commented on HDFS-3077:


The doc is excellent. I have a comment on a potential issue with the epoch number that could
arise in certain failure scenarios. Specifically, I am talking about the scenario in section 2.5.6.

J1 is at txid 153, J2 is at txid 150, and J3 is at txid 125. The epoch number on all the journals
is 1. Both NN1 and NN2 are trying to become_active() at the same time. NN1 talks to J1 and J2
and sets its proposedEpoch to 2. NN2 talks to J2 and J3 and also decides to set its proposedEpoch
to 2.

NN1 succeeds in setting newEpoch to 2 on J1 but fails on J2 and J3. NN1 dies since it does
not have a quorum.
NN2 succeeds in setting newEpoch to 2 on J2 and J3 and has a quorum. NN2 cannot talk to
J1. Similar to the scenario in 2.5.6, NN2 writes 151, 152, and 153 into J2 and J3 and then dies.

So the current state is that the epoch number is 2 on all the journals, and J1, J2, and J3 are
all at txid 153. We have a problem, since it is not possible to distinguish the log entries in
J1 from those in J2 and J3.
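
To make the sequence concrete, here is a rough sketch of the scenario as a toy model. It is not the HDFS-3077 implementation: the Journal class, propose_epoch(), the "old" and "NN2" writer labels, and the step that catches J3 up to txid 150 before NN2 writes are all assumptions made for illustration. The only thing it tries to show is that the visible state (epoch number, last txid) ends up identical on all three journals even though txids 151-153 on J1 came from a different writer.

```python
from dataclasses import dataclass, field


@dataclass
class Journal:
    """Toy journal: stores only an accepted epoch and its log entries."""
    name: str
    epoch: int = 1
    entries: dict = field(default_factory=dict)  # txid -> label of the writer

    def last_txid(self):
        return max(self.entries)

    def new_epoch(self, proposed):
        # Assumed rule: accept only a strictly higher epoch.
        if proposed <= self.epoch:
            return False
        self.epoch = proposed
        return True


def propose_epoch(reachable):
    # Assumed rule: one more than the highest epoch seen on reachable journals.
    return max(j.epoch for j in reachable) + 1


def run_scenario():
    # Starting point: J1 at txid 153, J2 at 150, J3 at 125, all at epoch 1.
    j1 = Journal("J1", entries={t: "old" for t in range(1, 154)})
    j2 = Journal("J2", entries={t: "old" for t in range(1, 151)})
    j3 = Journal("J3", entries={t: "old" for t in range(1, 126)})

    # Both NNs compute their proposal concurrently and arrive at 2.
    nn1_epoch = propose_epoch([j1, j2])
    nn2_epoch = propose_epoch([j2, j3])

    # NN1 reaches only J1, has no quorum, and dies.
    j1.new_epoch(nn1_epoch)

    # NN2 reaches J2 and J3 and has a quorum.
    j2.new_epoch(nn2_epoch)
    j3.new_epoch(nn2_epoch)

    # Assumption: J3 is caught up to J2 before new writes are accepted.
    for t in range(126, 151):
        j3.entries[t] = j2.entries[t]

    # NN2 writes 151-153 to J2 and J3, then dies.
    for t in (151, 152, 153):
        j2.entries[t] = "NN2"
        j3.entries[t] = "NN2"

    return nn1_epoch, nn2_epoch, [j1, j2, j3]


if __name__ == "__main__":
    nn1_epoch, nn2_epoch, journals = run_scenario()
    print("proposed epochs:", nn1_epoch, nn2_epoch)
    for j in journals:
        print(j.name, "epoch", j.epoch, "last txid", j.last_txid(),
              "writer of 152:", j.entries[152])
```

In this model a reader comparing only the epoch number and the last txid sees the same pair on every journal, which is the ambiguity described above.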


> Quorum-based protocol for reading and writing edit logs
> -------------------------------------------------------
>                 Key: HDFS-3077
>                 URL: https://issues.apache.org/jira/browse/HDFS-3077
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: ha, name-node
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hdfs-3077-partial.txt, qjournal-design.pdf
> Currently, one of the weak points of the HA design is that it relies on shared storage
> such as an NFS filer for the shared edit log. One alternative that has been proposed is to
> depend on BookKeeper, a ZooKeeper subproject which provides a highly available replicated
> edit log on commodity hardware. This JIRA is to implement another alternative, based on a
> quorum commit protocol, integrated more tightly in HDFS and with the requirements driven only
> by HDFS's needs rather than more generic use cases. More details to follow.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

