From: "Chen Zhang (JIRA)"
To: hdfs-issues@hadoop.apache.org
Date: Tue, 23 Jul 2019 09:45:00 +0000 (UTC)
Subject: [jira] [Comment Edited] (HDFS-12820) Decommissioned datanode is counted in service cause datanode allcating failure

    [ https://issues.apache.org/jira/browse/HDFS-12820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16890847#comment-16890847 ]

Chen Zhang edited comment on HDFS-12820 at 7/23/19 9:44 AM:
------------------------------------------------------------
Hi [~jojochuang], I've checked the code on the trunk branch, and I think this issue still exists in the latest version.

If we decommission a datanode and then stop it, the *nodesInService* field of DatanodeStats is not decremented; see the following code:
{code:java}
synchronized void subtract(final DatanodeDescriptor node) {
  xceiverCount -= node.getXceiverCount();
  if (node.isInService()) { // Admin.DECOMMISSIONED does not count as isInService
    capacityUsed -= node.getDfsUsed();
    capacityUsedNonDfs -= node.getNonDfsUsed();
    blockPoolUsed -= node.getBlockPoolUsed();
    nodesInService--;
    nodesInServiceXceiverCount -= node.getXceiverCount();
    capacityTotal -= node.getCapacity();
    capacityRemaining -= node.getRemaining();
    cacheCapacity -= node.getCacheCapacity();
    cacheUsed -= node.getCacheUsed();
  } else if (node.isDecommissionInProgress() ||
      node.isEnteringMaintenance()) {
    cacheCapacity -= node.getCacheCapacity();
    cacheUsed -= node.getCacheUsed();
  }
  ...
}
{code}
So if we have a cluster of 100 nodes and we decommission and stop 50 of them, the *nodesInService* variable will still be 100. The value that stats.getInServiceXceiverAverage returns is then only half of the real "average xceiver count", which causes most nodes to be treated as overloaded in the following code:
{code:java}
boolean excludeNodeByLoad(DatanodeDescriptor node) {
  final double maxLoad = considerLoadFactor *
      stats.getInServiceXceiverAverage(); // calculated as totalXceiverCount / nodesInService
  final int nodeLoad = node.getXceiverCount();
  if ((nodeLoad > maxLoad) && (maxLoad > 0)) {
    logNodeIsNotChosen(node, NodeNotChosenReason.NODE_TOO_BUSY,
        "(load: " + nodeLoad + " > " + maxLoad + ")");
    return true;
  }
  return false;
}
{code}

> Decommissioned datanode is counted in service cause datanode allcating failure
> ------------------------------------------------------------------------------
>
>                 Key: HDFS-12820
>                 URL: https://issues.apache.org/jira/browse/HDFS-12820
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: block placement
>    Affects Versions: 2.4.0
>            Reporter: Gang Xie
>            Priority: Major
>
> When allocating a datanode for a dfsclient write with load considered, the code checks whether the datanode is overloaded by calculating the average xceiver count of all in-service datanodes. But if a datanode is decommissioned and becomes dead, it is still treated as in service, which makes the average load much higher than the real one, especially when the number of decommissioned datanodes is large. In our cluster of 180 datanodes, 100 of them are decommissioned, and the average load is 17.
> This failed all the datanode allocation.
> private void subtract(final DatanodeDescriptor node) {
>   capacityUsed -= node.getDfsUsed();
>   blockPoolUsed -= node.getBlockPoolUsed();
>   xceiverCount -= node.getXceiverCount();
> {color:red}  if (!(node.isDecommissionInProgress() || node.isDecommissioned())) {{color}
>     nodesInService--;
>     nodesInServiceXceiverCount -= node.getXceiverCount();
>     capacityTotal -= node.getCapacity();
>     capacityRemaining -= node.getRemaining();
>   } else {
>     capacityTotal -= node.getDfsUsed();
>   }
>   cacheCapacity -= node.getCacheCapacity();
>   cacheUsed -= node.getCacheUsed();
> }

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
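The arithmetic behind the failure described above can be sketched in a few lines. This is an editor's illustration, not HDFS code: the class name, field names, and the considerLoadFactor value of 2.0 are stand-ins, while the node counts (180 total, 100 decommissioned and stopped) follow the report. Because nodesInService stays at 180, the computed average halves, the maxLoad threshold shrinks, and a node carrying the true average load is rejected as too busy.

```java
// Hypothetical sketch of the skewed in-service xceiver average; numbers
// mirror the report above, the code does not reproduce DatanodeStats.
public class SkewedAverageDemo {
    public static void main(String[] args) {
        int totalNodes = 180;
        int deadDecommissioned = 100;                     // stopped after decommission
        int liveNodes = totalNodes - deadDecommissioned;  // 80 nodes doing real work

        int totalXceivers = 3060;                         // all load sits on live nodes
        int realPerNodeLoad = totalXceivers / liveNodes;  // ~38 xceivers per live node

        // The bug: nodesInService was never decremented for the dead nodes,
        // so the denominator is 180 instead of 80.
        int staleNodesInService = totalNodes;
        double skewedAverage = (double) totalXceivers / staleNodesInService; // 17.0

        // excludeNodeByLoad-style check: maxLoad = considerLoadFactor * average
        double considerLoadFactor = 2.0;                  // assumed factor for illustration
        double maxLoad = considerLoadFactor * skewedAverage; // 34.0

        boolean excluded = (realPerNodeLoad > maxLoad) && (maxLoad > 0);
        System.out.println("skewedAverage=" + skewedAverage
            + " maxLoad=" + maxLoad
            + " realPerNodeLoad=" + realPerNodeLoad
            + " excluded=" + excluded);
        // A node at the *real* average (38) exceeds maxLoad (34), so under
        // these assumptions every typically loaded node is excluded.
    }
}
```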