hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2291) HA: Checkpointing in an HA setup
Date Wed, 21 Dec 2011 06:19:30 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173895#comment-13173895 ]

Todd Lipcon commented on HDFS-2291:
-----------------------------------

I plan to start working on this tomorrow. My thinking is to have a checkpoint thread that
wakes up on the checkpoint interval, stops the edit log tailer thread, enters safe mode, creates
a checkpoint, and then leaves safe mode. If at any point the standby (SB) needs to process a failover,
it will cancel the checkpoint (using the HDFS-2507 feature) and proceed as usual.
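
To make the ordering concrete, here is a rough Java sketch of one checkpoint cycle. It is only an illustration of the steps described above, not the planned implementation: the class, method, and step names are all hypothetical and are not HDFS APIs. It also assumes that the tailer is restarted and safe mode is left even when the checkpoint is cancelled, which the plan above does not spell out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch of one standby checkpoint cycle; not an HDFS class. */
class StandbyCheckpointSketch {
    /** Records each step in order so the sequence can be inspected. */
    final List<String> steps = new ArrayList<>();

    /** Set by another thread when the standby must process a failover. */
    private final AtomicBoolean failoverRequested = new AtomicBoolean(false);

    void requestFailover() {
        failoverRequested.set(true);
    }

    /** Runs one cycle; returns true only if the checkpoint was saved. */
    boolean runOnce() {
        steps.add("stopTailer");
        steps.add("enterSafeMode");
        try {
            return saveCheckpoint();
        } finally {
            // Assumption: always undo the two preparation steps,
            // whether the checkpoint completed or was cancelled.
            steps.add("leaveSafeMode");
            steps.add("startTailer");
        }
    }

    /** Stand-in for saving the image in chunks, checking for cancellation. */
    private boolean saveCheckpoint() {
        for (int chunk = 0; chunk < 3; chunk++) {
            if (failoverRequested.get()) {
                steps.add("cancelCheckpoint");
                return false;
            }
        }
        steps.add("saveCheckpoint");
        return true;
    }
}
```

In a real thread, runOnce() would be invoked once per checkpoint interval, and the cancellation check would be whatever HDFS-2507 provides rather than a polled flag.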

The remaining question I have yet to figure out is whether the standby should (a) save the checkpoints
into the shared edits directory, or (b) save them in its own directory and then upload them to
the primary via HTTP, just as the SecondaryNameNode (2NN) does today.

Option (b) is probably preferable, since the shared edits directory may in fact be BookKeeper (BK) or some other
journal plugin in the future, whereas option (a) would break the abstraction.

If anyone has any strong opinions please shout now :)
                
> HA: Checkpointing in an HA setup
> --------------------------------
>
>                 Key: HDFS-2291
>                 URL: https://issues.apache.org/jira/browse/HDFS-2291
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Aaron T. Myers
>            Assignee: Todd Lipcon
>             Fix For: HA branch (HDFS-1623)
>
>
> We obviously need to create checkpoints when HA is enabled. One thought is to use a third,
> dedicated checkpointing node in addition to the active and standby nodes. Another option would
> be to make the standby capable of also performing the function of checkpointing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
