hadoop-hdfs-issues mailing list archives

From "Suresh Srinivas (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
Date Tue, 03 Apr 2012 18:50:24 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245588#comment-13245588 ]

Suresh Srinivas commented on HDFS-3077:
---------------------------------------

Thanks for posting the design. Now I understand your comment that there is a lot in common
between this one and the approach in HDFS-3092. Here are some high-level comments:
# Terminology - JournalDaemon or JournalNode. I prefer JournalDaemon because my plan was to
run them in the same process space as the namenode. A JournalDaemon could also be a stand-alone
process.
# I like the idea of quorum writes and maintaining the queue. The HDFS-3092 design currently uses
a timeout to declare a JD slow and fail it. We were planning to punt on it until we had a first
implementation.
# newEpoch() is called fence() in HDFS-3092. My preference is to use the name fence(). I was
using version #, which is called epoch here. I think the name epoch sounds better. The key
difference is that version # is generated from a znode in HDFS-3092, so two namenodes cannot use
the same epoch number. I think there is a bug in the approach you have described, stemming from
the fact that two namenodes can use the same epoch and step 3 in 2.4 can be completed
independently of the quorum. This is shown in Hari's example.
# I prefer to record the epoch in the startLogSegment filler record. The startLogSegment record
was never part of the journal; we added it for structural reasons. So adding epoch info to it
should not matter. The way I see it, a journal belongs to a segment, and a segment has a single
version # or epoch.
# In both proposals, the epoch or version # needs to be sent in all journal requests.
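
To make the fence()/epoch points above concrete, here is a minimal sketch of how a single
journal daemon might check epochs. It is not code from either design: the class name
EpochFencedJournal, the promisedEpoch field, and the method signatures are hypothetical and
chosen only for illustration. It shows one daemon refusing a repeated epoch and rejecting
journal requests that carry a stale epoch. It does not, by itself, address the concern that
two namenodes could pick the same epoch against different daemons.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical per-daemon state; an illustration, not the HDFS-3077 or HDFS-3092 code. */
class EpochFencedJournal {
    // Highest epoch (version #) this daemon has accepted so far.
    private long promisedEpoch = 0;
    private final List<String> records = new ArrayList<>();

    /** fence() / newEpoch(): accept only an epoch strictly greater than any seen before. */
    synchronized boolean fence(long epoch) {
        if (epoch <= promisedEpoch) {
            return false; // same or older epoch: refuse, so an epoch is granted at most once here
        }
        promisedEpoch = epoch;
        return true;
    }

    /** Every journal request carries the writer's epoch; only the current epoch may write. */
    synchronized boolean journal(long epoch, String record) {
        if (epoch != promisedEpoch) {
            return false; // writer was fenced, or never fenced this daemon
        }
        records.add(record);
        return true;
    }

    synchronized int size() {
        return records.size();
    }
}
```

With this sketch, a writer that fenced with epoch 1 can no longer append once another writer
has fenced with epoch 2, and a second attempt to fence with epoch 2 is refused.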

We could certainly make a list of common work items and create jiras, so that many people
can collaborate and wrap it up, like we did in HDFS-1623.

                
> Quorum-based protocol for reading and writing edit logs
> -------------------------------------------------------
>
>                 Key: HDFS-3077
>                 URL: https://issues.apache.org/jira/browse/HDFS-3077
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: ha, name-node
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hdfs-3077-partial.txt, qjournal-design.pdf
>
>
> Currently, one of the weak points of the HA design is that it relies on shared storage
> such as an NFS filer for the shared edit log. One alternative that has been proposed is to
> depend on BookKeeper, a ZooKeeper subproject which provides a highly available replicated
> edit log on commodity hardware. This JIRA is to implement another alternative, based on a
> quorum commit protocol, integrated more tightly in HDFS and with the requirements driven only
> by HDFS's needs rather than more generic use cases. More details to follow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
