hadoop-hdfs-issues mailing list archives

From "John George (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2114) re-commission of a decommissioned node does not delete excess replica
Date Wed, 13 Jul 2011 23:53:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064963#comment-13064963 ]

John George commented on HDFS-2114:
-----------------------------------

Mat, thanks a lot for reviewing this patch. I have tried to integrate your comments and
will post a newer patch tomorrow with the changes to checkFile().

> 1. I know you didn't write TestDecommission.checkFile(), but your addition in 
> this patch of the {{isNodeDown}} flag raises a number of questions:
> * Aren't (isNodeDown && nodes[j].getName().equals(downnode)) and 
> (nodes[j].getName().equals(downnode)) always the same?  So is the first use of 
> {{isNodeDown}} even necessary?

The first use of {{isNodeDown}} is necessary because {{downnode}} could be null in cases
where we are checking for "Recommission".
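To illustrate the point with a stand-alone toy (the method name and addresses below are made up for the example and are not the actual TestDecommission code): when {{downnode}} is null during the recommission check, the {{isNodeDown}} flag short-circuits the comparison so {{equals()}} is never evaluated against a null target.

```java
// Toy illustration of the guard in checkFile(); names here are
// invented for the example, not taken from the Hadoop source.
public class CheckFileGuard {
    static boolean isDownReplica(boolean isNodeDown, String nodeName, String downnode) {
        // isNodeDown is false during the recommission check, so the
        // (possibly null) downnode is never compared at all.
        return isNodeDown && nodeName.equals(downnode);
    }

    public static void main(String[] args) {
        // Recommission case: downnode is null, flag is false.
        System.out.println(isDownReplica(false, "127.0.0.1:50010", null));
        // Decommission case: the matching replica is flagged.
        System.out.println(isDownReplica(true, "127.0.0.1:50010", "127.0.0.1:50010"));
    }
}
```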

> * Can there be any cases where (nodes[j].getName().equals(downnode)) and 
> (nodes[j].isDecommissioned()) are different? So can't the following two blocks 
> be merged?  And the location of the "is decommissioned" log message further 
> confuses this issue.
> {code}
>         if (isNodeDown && nodes[j].getName().equals(downnode)) {
>           hasdown++;
>           LOG.info("Block " + blk.getBlock() + " replica " + 
> nodes[j].getName()
>               + " is decommissioned.");
>         }
>         if (nodes[j].isDecommissioned()) {
>           if (firstDecomNodeIndex == -1) {
>             firstDecomNodeIndex = j;
>           }
>           continue;
>         }
> {code}
I think you are right. I will merge the two blocks and add an assert that fails if the
two conditions ever disagree.
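If the two conditions always agree, the merged block could look roughly like this (a sketch only, not the final patch):

{code}
        // Sketch: single block, with an assert replacing the
        // duplicated condition.
        if (nodes[j].isDecommissioned()) {
          assertTrue("Down node and decommissioned node should match",
              isNodeDown && nodes[j].getName().equals(downnode));
          hasdown++;
          LOG.info("Block " + blk.getBlock() + " replica "
              + nodes[j].getName() + " is decommissioned.");
          if (firstDecomNodeIndex == -1) {
            firstDecomNodeIndex = j;
          }
          continue;
        }
{code}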

> * What is the purpose of the assertion
> {code}
>         if(isNodeDown)
>           assertEquals("Decom node is not at the end", firstDecomNodeIndex, 
> -1);
> {code}

My understanding is that this assert ensures that the current blk has only ONE replica
in the decommissioned state.

> And why does it even work, since the node to decommission is chosen at random?
> * And in the same block, why is it important to condition it on 
> {{isNodeDown}}, since (!isNodeDown) implies there shouldn't be any 
> decommissioned nodes?  So the second use of {{isNodeDown}} also seems 
> unnecessary.
Because we don't care about this check in cases where no node is down.

> 
> 2. Regarding the timeout:  In TestDecommission, you appropriately set 
> BLOCKREPORT_INTERVAL_MSEC down to 1 sec to match the HEARTBEAT_INTERVAL.  You 
> may also want to consider DFS_NAMENODE_REPLICATION_INTERVAL_KEY (default 3 
> sec).  Adjusting this value to 1 might allow testRecommission() to run in 5 
> sec instead of 10.
> 
Will try this.
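For reference, the tweak would be roughly the following (assuming {{conf}} is the test's Configuration object, and using the key named above):

{code}
// Sketch: drop the NameNode replication recheck interval from its
// 3 s default to 1 s, matching the 1 s heartbeat and block report
// intervals already set in TestDecommission.
conf.setInt(DFSConfigKeys.DFS_NAMENODE_REPLICATION_INTERVAL_KEY, 1);
{code}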

> 3. I wish there were a clear way to share the duplicated code between 
> testDecommission and testRecommission, but I couldn't see an obviously better 
> choice.  Can you think of a way to improve this?
ok

> re-commission of a decommissioned node does not delete excess replica
> ---------------------------------------------------------------------
>
>                 Key: HDFS-2114
>                 URL: https://issues.apache.org/jira/browse/HDFS-2114
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: John George
>            Assignee: John George
>         Attachments: HDFS-2114-2.patch, HDFS-2114-3.patch, HDFS-2114.patch
>
>
> If a decommissioned node is removed from the decommissioned list, namenode does not delete the excess replicas it created while the node was decommissioned.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
