hadoop-hdfs-issues mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4114) Remove the CheckpointNode
Date Fri, 02 Nov 2012 09:09:14 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489318#comment-13489318 ]

Konstantin Shvachko commented on HDFS-4114:

> I'll re-purpose this jira to just remove the CheckpointNode.

I wonder what that means. CheckpointNode is just a role of the BackupNode in which it performs
checkpoints like SNN and does not keep the in-memory state in sync with the primary NN.
So changing the subject doesn't change the purpose.

From a formalistic perspective, you cannot just remove something from core Hadoop. You first
need to deprecate it, and only then may you remove it in the next major version. That is the rule I have
been following for the last 7 years. Let me know if it has changed recently. That is precisely why
SNN was deprecated rather than removed; otherwise we would have had a more efficient checkpointing
engine, see below.

I see BackupNode as a better way of creating checkpoints. SNN fetches the image and the edits
from the NN, merges them in memory, and then sends back the new checkpoint.
BN only needs to call saveNamespace() from memory and then send back the new image. This avoids
the network traffic and local disk I/O of transferring two large files. On multiple
large clusters I have seen the NameNode run much slower while a checkpoint is in progress.
It is beneficial for HDFS performance to switch from SNN to BN for checkpointing. Therefore
I would advocate re-re-deprecating SNN instead of removing BN.
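
To make the traffic argument above concrete, here is a rough cost sketch. It is only a toy model, not HDFS code: the class, the function names, and the sizes are hypothetical, and it counts only the transfers that happen at checkpoint time as described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckpointSizes:
    """Hypothetical file sizes, in bytes, involved in one checkpoint."""
    image_bytes: int      # current image held by the NN
    edits_bytes: int      # edits accumulated since that image
    new_image_bytes: int  # merged image produced by the checkpoint


def snn_transfer_bytes(s: CheckpointSizes) -> int:
    # SNN path: fetch image + edits from the NN, then send the new image back.
    return s.image_bytes + s.edits_bytes + s.new_image_bytes


def bn_transfer_bytes(s: CheckpointSizes) -> int:
    # BN path: the namespace is already in memory, so only the new image is sent.
    return s.new_image_bytes


GIB = 2**30
# Assumed sizes for illustration only; real clusters will differ.
sizes = CheckpointSizes(image_bytes=10 * GIB, edits_bytes=1 * GIB, new_image_bytes=10 * GIB)
saved = snn_transfer_bytes(sizes) - bn_transfer_bytes(sizes)
print(f"SNN: {snn_transfer_bytes(sizes) / GIB:.0f} GiB, "
      f"BN: {bn_transfer_bytes(sizes) / GIB:.0f} GiB, "
      f"saved: {saved / GIB:.0f} GiB")
```

Under these assumed sizes, the difference per checkpoint is exactly the image plus the edits, i.e. the two large files mentioned above.
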
I accept your criticism that BackupNode code path was getting less attention from me personally
and the community at large. Will have to work on that on my side.
I would be glad to go into design discussion and potential enhancements of BackupNode with
you. Would appreciate it given your experience with HA, as I believe the HA story for Hadoop
isn't over with the implementation of Quorum Journal.
That is not what this issue is about, though. Sticking to the point, what are your arguments for removing
(or, better said, deprecating) BN besides that it has bugs? Software tends to have bugs. E.g.
you do not propose to remove BlockScanner just because it couldn't be fixed over a series
> Remove the CheckpointNode
> -------------------------
>                 Key: HDFS-4114
>                 URL: https://issues.apache.org/jira/browse/HDFS-4114
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Eli Collins
>            Assignee: Eli Collins
> Per the thread on hdfs-dev@ (http://s.apache.org/tMT) let's remove the BackupNode and

