hadoop-hdfs-issues mailing list archives

From "Gang Xie (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-12820) Decommissioned datanode is counted in service cause datanode allcating failure
Date Mon, 20 Nov 2017 09:12:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258974#comment-16258974 ]

Gang Xie edited comment on HDFS-12820 at 11/20/17 9:11 AM:
-----------------------------------------------------------

After carefully checking the issue reported in HDFS-9279, I found that this issue is not a duplicate of it.

This case is mainly about the counter {color:#d04437}nodesInService{color}. When a datanode
is decommissioned and then becomes dead, nodesInService is not decremented.
Then, during allocation, the dead node is still counted when maxLoad is computed, which makes
maxLoad very low and, in turn, causes every allocation to fail.

    if (considerLoad) {
{color:#d04437}      final double maxLoad = maxLoadRatio * stats.getInServiceXceiverAverage();{color}
      final int nodeLoad = node.getXceiverCount();
      if (nodeLoad > maxLoad) {
        logNodeIsNotChosen(storage,
            "the node is too busy (load:"+nodeLoad+" > "+maxLoad+") ");
        stats.incrOverLoaded();
        return false;
      }
    }
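
To make the effect on this check concrete, here is a minimal sketch in plain Java. It is not the Hadoop code: the class LoadCheckSketch, its method names, the xceiver totals, and the ratio of 2.0 are all assumptions chosen for illustration. Only the node counts (80 live nodes out of 180) come from the cluster described in this issue.

```java
// Simplified model of the considerLoad check; not the actual Hadoop class.
final class LoadCheckSketch {
    // Average xceiver count over the nodes that are counted as in service.
    static double inServiceXceiverAverage(int inServiceXceivers, int nodesInService) {
        return nodesInService == 0 ? 0.0 : (double) inServiceXceivers / nodesInService;
    }

    // Mirrors the quoted condition: a node is rejected when nodeLoad > maxLoad.
    static boolean isTooBusy(int nodeLoad, double maxLoadRatio, double average) {
        return nodeLoad > maxLoadRatio * average;
    }

    public static void main(String[] args) {
        double ratio = 2.0;       // assumed value of maxLoadRatio
        int totalXceivers = 4000; // hypothetical total load on the live nodes
        int liveNodes = 80;       // nodes that are really in service
        int staleCount = 180;     // live nodes plus 100 dead decommissioned ones
        int nodeLoad = 50;        // a node carrying exactly the real average load

        double realAverage = inServiceXceiverAverage(totalXceivers, liveNodes);
        double staleAverage = inServiceXceiverAverage(totalXceivers, staleCount);

        // With the correct denominator the node passes the check.
        System.out.println(isTooBusy(nodeLoad, ratio, realAverage));  // false
        // With the stale denominator the same node is rejected as too busy.
        System.out.println(isTooBusy(nodeLoad, ratio, staleAverage)); // true
    }
}
```

Under these assumed numbers, a node carrying only the true average load already exceeds maxLoad, so every node at or above the true average would be refused.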


> Decommissioned datanode is counted in service cause datanode allcating failure
> ------------------------------------------------------------------------------
>
>                 Key: HDFS-12820
>                 URL: https://issues.apache.org/jira/browse/HDFS-12820
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: block placement
>    Affects Versions: 2.4.0
>            Reporter: Gang Xie
>
> When a datanode is allocated for a DFSClient write with load consideration enabled, the
> block placement code checks whether the datanode is overloaded by calculating the average
> xceiver count of all in-service datanodes. But if a datanode is decommissioned and then
> becomes dead, it is still treated as in service, which makes the computed average load much
> lower than the real one, especially when the number of decommissioned datanodes is large.
> In our cluster of 180 datanodes, 100 are decommissioned and the average load is 17.
> This made every datanode allocation fail.
> private void subtract(final DatanodeDescriptor node) {
>       capacityUsed -= node.getDfsUsed();
>       blockPoolUsed -= node.getBlockPoolUsed();
>       xceiverCount -= node.getXceiverCount();
>     {color:red}  if (!(node.isDecommissionInProgress() || node.isDecommissioned())) {{color}
>         nodesInService--;
>         nodesInServiceXceiverCount -= node.getXceiverCount();
>         capacityTotal -= node.getCapacity();
>         capacityRemaining -= node.getRemaining();
>       } else {
>         capacityTotal -= node.getDfsUsed();
>       }
>       cacheCapacity -= node.getCacheCapacity();
>       cacheUsed -= node.getCacheUsed();
>     }
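
The accounting described above can be sketched with a stripped-down model. This is an illustration under stated assumptions, not the Hadoop implementation: StatsSketch and Node are hypothetical stand-ins, add() is assumed to mirror the guard shown in subtract(), and the sequence in main (the decommissioned flag changing while the node is still counted) is one assumed way to reach the state the report describes.

```java
// Simplified model of the in-service accounting; not the real Hadoop classes.
final class StatsSketch {
    // Hypothetical stand-in for DatanodeDescriptor.
    static final class Node {
        final int xceiverCount;
        boolean decommissioned;

        Node(int xceiverCount) {
            this.xceiverCount = xceiverCount;
        }
    }

    int nodesInService;
    int nodesInServiceXceiverCount;

    // Assumed counterpart of subtract(): only in-service nodes are counted.
    void add(Node node) {
        if (!node.decommissioned) {
            nodesInService++;
            nodesInServiceXceiverCount += node.xceiverCount;
        }
    }

    // Same guard as the highlighted line in the quoted subtract().
    void subtract(Node node) {
        if (!node.decommissioned) {
            nodesInService--;
            nodesInServiceXceiverCount -= node.xceiverCount;
        }
    }

    public static void main(String[] args) {
        StatsSketch stats = new StatsSketch();
        Node a = new Node(10);
        Node b = new Node(10);
        stats.add(a);
        stats.add(b);

        // Assumed sequence: b is marked decommissioned while still counted,
        // then dies and is subtracted. The guard skips the decrement.
        b.decommissioned = true;
        stats.subtract(b);

        // b is still included in the in-service count even though it is gone.
        System.out.println("nodesInService = " + stats.nodesInService);
    }
}
```

In this model the counter can only be corrected if the node is subtracted before its state changes, which is why the guard in subtract() is the line highlighted in the report.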



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

