hadoop-hdfs-issues mailing list archives

From "Mathias Herberts (JIRA)" <j...@apache.org>
Subject [jira] Created: (HDFS-1590) Decommissioning never ends when node to decommission has blocks that are under-replicated and cannot be replicated to the expected level of replication
Date Fri, 21 Jan 2011 15:59:48 GMT
Decommissioning never ends when node to decommission has blocks that are under-replicated and
cannot be replicated to the expected level of replication
-------------------------------------------------------------------------------------------------------------------------------------------------------

                 Key: HDFS-1590
                 URL: https://issues.apache.org/jira/browse/HDFS-1590
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: name-node
    Affects Versions: 0.20.2
         Environment: Linux
            Reporter: Mathias Herberts
            Priority: Minor


On a test cluster with 4 DNs and a default replication factor of 3, I recently attempted to decommission
one of the DNs. Right after updating the file referenced by dfs.hosts.exclude and running 'dfsadmin
-refreshNodes', I could see the blocks being replicated to other nodes.

After a while, the replication stopped but the node was not marked as decommissioned.

When running 'fsck -files -blocks -locations', I saw that all files had a replication of
4 (which is logical given there are 4 DNs), but some of the files had an expected replication
of 10 (those were job.jar files from M/R jobs).

I ran 'fs -setrep 3' on those files and shortly after the namenode reported the DN as decommissioned.

Shouldn't this case be checked by the NameNode when decommissioning a node? I.e., consider
a node decommissioned if either one of the following is true for each block on the node being
decommissioned:

1. It is replicated more than the expected replication level.
2. It is replicated as much as possible given the available nodes, even though it is less
replicated than expected.
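
The two conditions above can be sketched as a small check. This is only an illustration of the
proposed rule, not NameNode code: the function names and parameters are hypothetical, replicas
are counted excluding the node being decommissioned, and meeting the expected level exactly is
treated as sufficient.

```python
def block_allows_decommission(live_replicas, expected_replication, available_nodes):
    """Return True if this block should no longer hold up decommissioning.

    live_replicas        -- replicas on nodes other than the one being decommissioned
    expected_replication -- replication factor requested for the file
    available_nodes      -- live DNs that can hold a replica (decommissioning node excluded)
    """
    # A DN holds at most one replica of a block, so the reachable target
    # is capped by the number of available nodes.
    achievable = min(expected_replication, available_nodes)
    return live_replicas >= achievable


def node_can_be_decommissioned(blocks, available_nodes):
    """blocks is an iterable of (live_replicas, expected_replication) pairs."""
    return all(
        block_allows_decommission(live, expected, available_nodes)
        for live, expected in blocks
    )


# Scenario from this report: 4 DNs, one being decommissioned, so 3 remain.
blocks = [
    (3, 3),   # ordinary file with the default replication of 3
    (3, 10),  # job.jar with an expected replication of 10
]

# Comparing against the expected replication alone never succeeds for job.jar.
strict = all(live >= expected for live, expected in blocks)

# Capping the target at the number of available nodes lets the node finish.
proposed = node_can_be_decommissioned(blocks, available_nodes=3)

print(strict, proposed)
```

With the cap in place, a block stops blocking decommissioning once no further replica could be
placed anywhere, which is the situation the job.jar files were in.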


