hadoop-hdfs-issues mailing list archives

From "Matt Foley (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2114) re-commission of a decommissioned node does not delete excess replica
Date Wed, 13 Jul 2011 07:44:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064409#comment-13064409 ]

Matt Foley commented on HDFS-2114:
----------------------------------

Hi John, 
TestDecommission.java:

1. I know you didn't write TestDecommission.checkFile(), but the {{isNodeDown}} flag you added in this patch raises a number of questions:
* Isn't {{(isNodeDown && nodes[j].getName().equals(downnode))}} always equivalent to plain {{(nodes[j].getName().equals(downnode))}}? If so, is the first use of {{isNodeDown}} even necessary?
* Can there be any case where {{(nodes[j].getName().equals(downnode))}} and {{(nodes[j].isDecommissioned())}} differ? If not, can't the following two blocks be merged (see the sketch after this list)? The placement of the "is decommissioned" log message further confuses the issue.
{code}
        if (isNodeDown && nodes[j].getName().equals(downnode)) {
          hasdown++;
          LOG.info("Block " + blk.getBlock() + " replica " + nodes[j].getName()
              + " is decommissioned.");
        }
        if (nodes[j].isDecommissioned()) {
          if (firstDecomNodeIndex == -1) {
            firstDecomNodeIndex = j;
          }
          continue;
        }
{code}
* What is the purpose of the assertion
{code}
        if(isNodeDown)
          assertEquals("Decom node is not at the end", firstDecomNodeIndex, -1);
{code}
And why does it even work, since the node to decommission is chosen at random?
* And in the same block, why is it important to condition the check on {{isNodeDown}}? Since {{(!isNodeDown)}} implies there shouldn't be any decommissioned nodes, the second use of {{isNodeDown}} also seems unnecessary.
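
For concreteness, if those equivalences do hold, the merged form might look like this (a sketch only, keeping the patch's existing names):
{code}
        // Sketch: assumes nodes[j].isDecommissioned() holds exactly when the
        // replica lives on the down node, per the questions above.
        if (nodes[j].isDecommissioned()) {
          hasdown++;
          LOG.info("Block " + blk.getBlock() + " replica " + nodes[j].getName()
              + " is decommissioned.");
          if (firstDecomNodeIndex == -1) {
            firstDecomNodeIndex = j;
          }
          continue;
        }
{code}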

2. Regarding the timeout: in TestDecommission you appropriately set BLOCKREPORT_INTERVAL_MSEC down to 1 second to match the HEARTBEAT_INTERVAL. You may also want to consider DFS_NAMENODE_REPLICATION_INTERVAL_KEY (default 3 seconds); lowering it to 1 might let testRecommission() run in 5 seconds instead of 10.
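
For instance, the test setup might do something like this (a sketch; I'm assuming the test's existing BLOCKREPORT_INTERVAL_MSEC constant and the standard DFSConfigKeys names):
{code}
    // Sketch: tune the namenode's replication monitor alongside the existing
    // block-report/heartbeat settings so excess replicas are noticed sooner.
    conf.setLong(DFSConfigKeys.DFS_BLOCKREPORT_INTERVAL_MSEC_KEY,
        BLOCKREPORT_INTERVAL_MSEC);  // already 1 sec in this test
    // dfs.namenode.replication.interval defaults to 3 sec; drop it to 1.
    conf.setInt(DFSConfigKeys.DFS_NAMENODE_REPLICATION_INTERVAL_KEY, 1);
{code}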

3. I wish there were a clear way to share the duplicated code between testDecommission and
testRecommission, but I couldn't see an obviously better choice.  Can you think of a way to
improve this?

FSNamesystem and BlockManager:

The implementation looks okay to me.

The failure of the unit test TestTrash.testTrashEmptier does not seem to be related to this change. It is probably related to HDFS-7326, although the symptoms reported there are slightly different.

> re-commission of a decommissioned node does not delete excess replica
> ---------------------------------------------------------------------
>
>                 Key: HDFS-2114
>                 URL: https://issues.apache.org/jira/browse/HDFS-2114
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: John George
>            Assignee: John George
>         Attachments: HDFS-2114-2.patch, HDFS-2114-3.patch, HDFS-2114.patch
>
>
> If a decommissioned node is removed from the decommissioned list, namenode does not delete
> the excess replicas it created while the node was decommissioned.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
