Date: Mon, 20 Nov 2017 09:11:01 +0000 (UTC)
From: "Gang Xie (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-12820) Decommissioned datanode is counted in service cause datanode allcating failure

    [ https://issues.apache.org/jira/browse/HDFS-12820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258975#comment-16258975 ]

Gang Xie commented on HDFS-12820:
---------------------------------

And I believe this issue still exists in the latest version.

> Decommissioned datanode is counted in service cause datanode allcating failure
> ------------------------------------------------------------------------------
>
>                 Key: HDFS-12820
>                 URL: https://issues.apache.org/jira/browse/HDFS-12820
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: block placement
>    Affects Versions: 2.4.0
>            Reporter: Gang Xie
>
> When a datanode is allocated for a dfsclient write with load consideration enabled, the allocation checks whether the datanode is overloaded by calculating the average xceiver count of all in-service datanodes. But if a datanode is decommissioned and then becomes dead, it is still treated as in service, which makes the average load much higher than the real one, especially when the number of decommissioned datanodes is large.
> In our cluster there are 180 datanodes, 100 of them decommissioned, and the average load is 17. This caused every datanode allocation to fail.
>
>   private void subtract(final DatanodeDescriptor node) {
>     capacityUsed -= node.getDfsUsed();
>     blockPoolUsed -= node.getBlockPoolUsed();
>     xceiverCount -= node.getXceiverCount();
> {color:red}    if (!(node.isDecommissionInProgress() || node.isDecommissioned())) {{color}
>       nodesInService--;
>       nodesInServiceXceiverCount -= node.getXceiverCount();
>       capacityTotal -= node.getCapacity();
>       capacityRemaining -= node.getRemaining();
>     } else {
>       capacityTotal -= node.getDfsUsed();
>     }
>     cacheCapacity -= node.getCacheCapacity();
>     cacheUsed -= node.getCacheUsed();
>   }
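
To make the accounting concern in the description concrete, here is a minimal, self-contained sketch. It is a simplified model and not the Hadoop implementation: the class names (LoadStatsSketch, Node, Stats), the add() method, and the xceiver numbers are all hypothetical. It also assumes that a node's decommission flag can change between the time it is added to the stats and the time it is subtracted, without the counters being updated in between. Under that assumption, the guard in subtract() skips the decrement and the in-service counters keep the dead node's contribution.

```java
// Simplified model for illustration only; not the Hadoop implementation.
// All class names, method names, and numbers here are hypothetical.
public class LoadStatsSketch {

    /** Minimal stand-in for a datanode: an xceiver count plus a decommission flag. */
    static final class Node {
        final int xceiverCount;
        boolean decommissioned;

        Node(int xceiverCount) {
            this.xceiverCount = xceiverCount;
        }
    }

    /** Tracks only the two counters that feed the in-service average. */
    static final class Stats {
        int nodesInService;
        int nodesInServiceXceiverCount;

        void add(Node node) {
            if (!node.decommissioned) {
                nodesInService++;
                nodesInServiceXceiverCount += node.xceiverCount;
            }
        }

        // Same guard shape as the quoted subtract(): decommissioned nodes are skipped.
        void subtract(Node node) {
            if (!node.decommissioned) {
                nodesInService--;
                nodesInServiceXceiverCount -= node.xceiverCount;
            }
        }

        double inServiceXceiverAverage() {
            return nodesInService == 0
                ? 0.0
                : (double) nodesInServiceXceiverCount / nodesInService;
        }
    }

    /** Replays the assumed sequence and returns the average the stats report afterwards. */
    static double averageAfterDecommissionedNodeDies() {
        Stats stats = new Stats();
        Node a = new Node(10);
        Node b = new Node(10);
        Node c = new Node(40);
        stats.add(a);
        stats.add(b);
        stats.add(c);

        c.decommissioned = true; // assumption: state flips without the stats being updated
        stats.subtract(c);       // node dies; the guard skips the decrement

        return stats.inServiceXceiverAverage();
    }

    public static void main(String[] args) {
        // The two live nodes average 10 xceivers each, but the stale counters
        // still include the dead node's 40 xceivers and its slot in the count.
        System.out.println(averageAfterDecommissionedNodeDies());
    }
}
```

In this model the reported average is 60 / 3 = 20.0, while the two nodes that are actually alive and in service average 10.0. Whether the real namenode can reach this state depends on how decommissioning updates the statistics in a given Hadoop version, which this sketch does not attempt to reproduce.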