hadoop-hdfs-issues mailing list archives

From "Suresh Srinivas (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-4114) Deprecate the BackupNode and CheckpointNode in 2.0
Date Fri, 06 Dec 2013 19:55:37 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841600#comment-13841600 ]

Suresh Srinivas edited comment on HDFS-4114 at 12/6/13 7:54 PM:
----------------------------------------------------------------

bq. As it stands today BackupNode is the only extension of the NameNode in the current code
base.
I do not think that is a sufficient reason to retain BackupNode. If you really want to show
how Namenode can be extended, you could contribute another simpler, easier-to-maintain example
that extends Namenode. In fact, some of the constructs that are used only by BackupNode are
not, I reckon, what extensions of Namenode would use. Some examples:
# It uses JournalProtocol and NamenodeProtocol to co-ordinate checkpointing. This is no longer
necessary with the improvements in edits, where checkpointing can be done at any time without
the need to roll, start checkpoint, and end checkpoint.
# A lot of code in FSImage and FSEditLog caters to this, just for BackupNode. This code is
not well documented and adds unnecessary complexity.

As you can see from the early patch, we can remove approximately 5000 lines of code. This code
belongs to functionality that no one tests or uses. In fact, I would not be surprised if
there are bugs lurking in that functionality that might cause major issues for a misguided
user who ends up using it.

Given that, I believe BackupNode should be removed. As to whether any code that helps with
extending the Namenode is being removed, I would like to see a proposal on what extending a
Namenode means and which functionality relevant to that is being removed in my patch.

bq. You are right it's been a while and I have a debt to provide proper ones, which is on
my todo list.
I fail to understand what the plans for BackupNode are and why it is still relevant. Describing
that would help.

bq.  If you wish we can assign this issue to me so that I could take care of it in the future.
I wish it were as easy as just assigning a bug to you. When making changes in the code
with a feature in mind, there is a lot of this unused code and tests that also needs to change.
This is currently a tax that feature developers are paying. The folks working on a feature
have a time frame that they are working towards. Having to depend on you for related changes
means having to co-ordinate the work with you and getting the work done within the timeline.
This will not only be work for you, but also work for people working on features. It is hard
for me to see why we should spend all that effort.

I can give you a few examples where folks had to do all this unnecessary work:
- When we did protobuf support, we had to add support for all the protocols that are only used
by BackupNode.
- In HA, considerable coding and testing effort went into supporting BackupNode.
- Recently, when I worked on retry cache, I spent a lot of time just understanding how all
this works and added support for retriability.
- I also know that [~jingzhao] and [~wheat9] spent time on BackupNode-specific functionality
when working on HTTP policy and HTTPS support related cleanup.

Unless there are justified reasons for retaining this functionality, regular contributors
to HDFS will have to continue to pay this cost. We have waited almost a year for a plan for
taking BackupNode forward. With Namenode HA stabilizing, I am also not sure how relevant
such a plan would be, even if there is one.

One suggestion is to move this functionality to GitHub, where you could maintain it as HDFS
changes. This is in essence equivalent to involving you in maintaining BackupNode-related
functionality for features added to HDFS, without the cost of co-ordination.


> Deprecate the BackupNode and CheckpointNode in 2.0
> --------------------------------------------------
>
>                 Key: HDFS-4114
>                 URL: https://issues.apache.org/jira/browse/HDFS-4114
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Eli Collins
>            Assignee: Suresh Srinivas
>         Attachments: HDFS-4114.patch
>
>
> Per the thread on hdfs-dev@ (http://s.apache.org/tMT) let's remove the BackupNode and
> CheckpointNode.



