hadoop-hdfs-issues mailing list archives

From "Matt Foley (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2114) re-commission of a decommissioned node does not delete excess replica
Date Wed, 13 Jul 2011 07:44:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064409#comment-13064409 ]

Matt Foley commented on HDFS-2114:

Hi John, 

1. I know you didn't write TestDecommission.checkFile(), but your addition of the {{isNodeDown}}
flag in this patch raises a number of questions:
* Aren't (isNodeDown && nodes[j].getName().equals(downnode)) and (nodes[j].getName().equals(downnode))
always equivalent here?  If so, is the first use of {{isNodeDown}} even necessary?
* Can there be any case where (nodes[j].getName().equals(downnode)) and (nodes[j].isDecommissioned())
differ?  If not, can't the following two blocks be merged?  The placement of the "is decommissioned"
log message further confuses this issue.
{code}
        if (isNodeDown && nodes[j].getName().equals(downnode)) {
          LOG.info("Block " + blk.getBlock() + " replica " + nodes[j].getName()
              + " is decommissioned.");
        }
        if (nodes[j].isDecommissioned()) {
          if (firstDecomNodeIndex == -1) {
            firstDecomNodeIndex = j;
          }
        }
{code}
* What is the purpose of the assertion
{code}
          assertEquals("Decom node is not at the end", firstDecomNodeIndex, -1);
{code}
And why does it work at all, since the node to decommission is chosen at random?
* In the same block, why is it important to condition on {{isNodeDown}}?  (!isNodeDown)
implies there shouldn't be any decommissioned nodes, so the second use of {{isNodeDown}}
also seems unnecessary.
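To make the merge question concrete, here is a self-contained sketch of what the collapsed loop could look like.  This is illustrative only, not the patch's code: {{Node}} is a minimal stand-in for {{DatanodeInfo}}, and the method name {{firstDecomIndex}} is invented for the example.

```java
// Hedged sketch (not the patch's code): if getName().equals(downnode) and
// isDecommissioned() always agree, the two blocks in checkFile() collapse to one.
public class MergedCheckSketch {
    // Minimal stand-in for DatanodeInfo: a name plus decommission state.
    static final class Node {
        final String name;
        final boolean decommissioned;
        Node(String name, boolean decommissioned) {
            this.name = name;
            this.decommissioned = decommissioned;
        }
    }

    // Index of the first replica on the downed node, or -1 if none.
    static int firstDecomIndex(Node[] nodes, String downnode) {
        int firstDecomNodeIndex = -1;
        for (int j = 0; j < nodes.length; j++) {
            if (nodes[j].name.equals(downnode)) {
                // Premise of the merge: the downed node is the decommissioned one.
                if (!nodes[j].decommissioned) {
                    throw new AssertionError("conditions diverge at index " + j);
                }
                if (firstDecomNodeIndex == -1) {
                    firstDecomNodeIndex = j;
                }
            }
        }
        return firstDecomNodeIndex;
    }

    public static void main(String[] args) {
        Node[] nodes = { new Node("dn0", false), new Node("dn1", true) };
        System.out.println(firstDecomIndex(nodes, "dn1")); // prints 1
    }
}
```

If the premise ever fails in practice, the AssertionError would surface exactly the case where the two conditions differ, which would answer the question above.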

2. Regarding the timeout:  In TestDecommission, you appropriately set BLOCKREPORT_INTERVAL_MSEC
down to 1 sec to match the HEARTBEAT_INTERVAL.  You may also want to consider DFS_NAMENODE_REPLICATION_INTERVAL_KEY
(default 3 sec).  Adjusting this value to 1 might allow testRecommission() to run in 5 sec
instead of 10.
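For reference, the test-configuration tweak could look something like the fragment below.  This is a sketch only; the key constants are assumed from {{DFSConfigKeys}} and should be verified against the tree you build, and the exact setter types may differ.

```java
// Sketch only: shorten the intervals so excess replicas are noticed quickly.
// Key names assumed from DFSConfigKeys; verify before use.
Configuration conf = new HdfsConfiguration();
conf.setLong(DFSConfigKeys.DFS_BLOCKREPORT_INTERVAL_MSEC_KEY, 1000L); // 1 sec, matches heartbeat
conf.setLong(DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY, 1L);           // 1 sec
conf.setInt(DFSConfigKeys.DFS_NAMENODE_REPLICATION_INTERVAL_KEY, 1);  // default is 3 sec
```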

3. I wish there were a clear way to share the duplicated code between testDecommission and
testRecommission, but I couldn't see an obviously better choice.  Can you think of a way to
improve this?

FSNamesystem and BlockManager:

The implementation looks okay to me.

The failure of unit test TestTrash.testTrashEmptier does not seem to be related to this change.
It is probably related to HDFS-7326, although the symptoms reported are slightly different.

> re-commission of a decommissioned node does not delete excess replica
> ---------------------------------------------------------------------
>                 Key: HDFS-2114
>                 URL: https://issues.apache.org/jira/browse/HDFS-2114
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: John George
>            Assignee: John George
>         Attachments: HDFS-2114-2.patch, HDFS-2114-3.patch, HDFS-2114.patch
> If a decommissioned node is removed from the decommissioned list, namenode does not delete
> the excess replicas it created while the node was decommissioned.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

