hadoop-common-dev mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-4539) Streaming Edits to a Standby Name-Node.
Date Thu, 19 Feb 2009 10:17:01 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-4539:

    Attachment: BackupNode.patch

This patch introduces two new types of name-nodes: a Checkpoint node and a Backup node. 
- The role of the *Checkpoint node* is to checkpoint name-node meta-data by merging image and
edits files.
- The *Backup node* extends the functionality of the Checkpointer in that it can receive online
updates of the file system meta-data, apply them to its in-memory state, and persist them to disk
just like the name-node does. Thus at any time the Backup node contains an up-to-date image
of the namespace both in memory and on local disk(s).
This also results in much more efficient checkpointing because the Backup node does not need to
transfer files from the active name-node and does not need to replay (merge) edits.
- The term *Standby node* is reserved for a further extension of the Backup node functionality,
in which the cluster will be able to switch over to the new name-node if the active one dies.
This is mentioned in the "Warm standby provision" section of the design document.
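
To illustrate the difference between the two node types, here is a minimal conceptual sketch in Python. It is not the patch's code: the namespace is reduced to a set of paths, an edit to an {{(op, path)}} tuple, and the names {{apply_edit}}, {{checkpoint_merge}} and {{BackupNode.receive}} are hypothetical. It only shows that replaying edits on top of an old image (Checkpoint node) and applying each edit as it arrives (Backup node) yield the same image, while the Backup node needs no transfer or replay at checkpoint time.

```python
from dataclasses import dataclass, field

# Simplified stand-ins (assumptions): the image is a set of paths and
# an edit is an (op, path) tuple. The real image and edits files are richer.


def apply_edit(namespace, edit):
    """Apply one journal record to a namespace (a set of paths) in place."""
    op, path = edit
    if op == "create":
        namespace.add(path)
    elif op == "delete":
        namespace.discard(path)
    else:
        raise ValueError(f"unknown op: {op}")


def checkpoint_merge(image, edits):
    """Checkpoint node: take a copy of the image and replay the edits on it."""
    merged = set(image)
    for edit in edits:
        apply_edit(merged, edit)
    return merged


@dataclass
class BackupNode:
    """Backup node: applies each streamed edit in memory and journals it."""
    namespace: set = field(default_factory=set)
    journal: list = field(default_factory=list)

    def receive(self, edit):
        apply_edit(self.namespace, edit)  # update the in-memory state
        self.journal.append(edit)         # stand-in for persisting to local disk

    def checkpoint(self):
        # No file transfer and no replay: memory already holds the current state.
        image = set(self.namespace)
        self.journal.clear()
        return image


old_image = {"/a"}
edits = [("create", "/b"), ("delete", "/a"), ("create", "/c")]

backup = BackupNode(namespace=set(old_image))
for e in edits:
    backup.receive(e)

merged_image = checkpoint_merge(old_image, edits)
backup_image = backup.checkpoint()
print(sorted(merged_image))          # ['/b', '/c']
print(merged_image == backup_image)  # True
```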

Typical use cases:
# Run Checkpoint node only to create checkpoints. This should be used instead of the current
SecondaryNameNode, which is deprecated by the patch. I reused a lot of the SecondaryNameNode
code, so this effort was not wasted; it just evolved.
# Run Backup node to support online streaming of edits and efficient checkpointing.
This particularly targets eliminating NFS as remote storage for edits.
# Run NameNode without persistent storage at all and delegate all "persisting" functionality
to the Backup node. The trick here is to start the name-node with the {{-importCheckpoint}} option
and then run the Backup node.
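
The third use case can be pictured with a small conceptual model, again in Python and not taken from the patch. The classes {{DisklessNameNode}} and {{BackupNode}} and their methods are hypothetical. How the real Backup node registers and synchronizes with the name-node is not described in this message, so the sketch simply assumes it starts from the same image. Refusing edits while no Backup node is attached is also an assumption of the sketch, not a documented behavior.

```python
class BackupNode:
    """Holds the only persistent copy of the namespace in this sketch."""

    def __init__(self, image):
        self.namespace = set(image)  # in-memory copy
        self.journal = []            # stand-in for edits persisted on local disk

    def receive(self, edit):
        op, path = edit
        if op == "create":
            self.namespace.add(path)
        elif op == "delete":
            self.namespace.discard(path)
        else:
            raise ValueError(f"unknown op: {op}")
        self.journal.append(edit)

    def latest_checkpoint(self):
        return frozenset(self.namespace)


class DisklessNameNode:
    """Name-node with no storage of its own: it loads an image once at startup
    (the role played by -importCheckpoint) and forwards every edit."""

    def __init__(self, imported_image):
        self.namespace = set(imported_image)
        self.backup = None

    def attach(self, backup):
        self.backup = backup

    def _log(self, edit):
        # Assumption of this sketch: without a Backup node the edit would be lost.
        if self.backup is None:
            raise RuntimeError("no Backup node attached; edit would be lost")
        self.backup.receive(edit)

    def create(self, path):
        self._log(("create", path))
        self.namespace.add(path)

    def delete(self, path):
        self._log(("delete", path))
        self.namespace.discard(path)


checkpoint = {"/user", "/tmp"}
nn = DisklessNameNode(checkpoint)  # start the name-node from a checkpoint
backup = BackupNode(nn.namespace)  # then run the Backup node
nn.attach(backup)

nn.create("/user/alice")
nn.delete("/tmp")

# After a name-node restart, the state is recovered from the Backup node's image.
restarted = DisklessNameNode(backup.latest_checkpoint())
print(sorted(restarted.namespace))  # ['/user', '/user/alice']
```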

In the near term I plan to 
- attach an updated design document with all modifications and clarifications to the initial
- provide more test cases in TestBackupNode unit test;
- and perform large scale testing.

> Streaming Edits to a Standby Name-Node.
> ---------------------------------------
>                 Key: HADOOP-4539
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4539
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>         Attachments: BackupNode.patch, image001.gif, StreamEditsToSNN.htm
> Currently the Secondary name-node acts as a mere checkpointer.
> Secondary name-node should be transformed into a standby name-node (SNN).
> The long term goal is to make it a warm standby.
> The purpose of this issue is to provide real-time streaming of edits to the SNN so that it
> contains the up-to-date namespace state.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
