Date: Wed, 19 Apr 2017 22:19:41 +0000 (UTC)
From: "Riza Suminto (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Updated] (HDFS-11684) Potential scalability bug on datanode recommission

     [ https://issues.apache.org/jira/browse/HDFS-11684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Riza Suminto updated HDFS-11684:
--------------------------------
    Description:
We are academic researchers working on scalability bug detection. Upon analyzing the Hadoop code, our static analysis program detects a potential scalability problem along this chain of function calls:

DatanodeManager.refreshNodes() -> DatanodeManager.refreshDatanodes() -> DecommissionManager.stopDecommission() -> BlockManager.processExtraRedundancyBlocksOnInService()

DatanodeManager.refreshNodes() calls namesystem.writeLock(), which locks FSNamesystem.
While holding the FSNamesystem lock, it runs a nested loop over the datanodes in DatanodeManager.refreshDatanodes() and over each datanode's blocks in BlockManager.processExtraRedundancyBlocksOnInService(). Counting only executed lines, refreshDatanodes will execute N*(136*B+137) lines of code in total, where N is the number of datanodes and B is the number of blocks per node.

This is a potential scalability bug that may render FSNamesystem unresponsive for a long time if refreshNodes triggers the recommissioning of many fat datanodes, each holding a large number of over-replicated data blocks. This bug is also discussed in HDFS-10477, but remains unresolved.

In older Hadoop versions, our program also detected a similar scalability problem when decommissioning datanodes, which was fixed in HADOOP-4061. However, recommissioning still has this scalability problem in the current version.
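To make the pattern concrete, below is a minimal, self-contained Java sketch of the structure described above. It is not the actual HDFS code: only the method names mirror the real call chain, while the class, field, and lock names (RefreshNodesSketch, DatanodeSketch, namesystemLock, beingRecommissioned) are made-up placeholders. The point it illustrates is that all N*B block scans happen inside a single FSNamesystem write-lock acquisition.

{code:java}
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/**
 * Simplified sketch of the locking pattern described above.
 * Not the actual HDFS code; only the cited method names are real,
 * everything else is a made-up placeholder.
 */
class RefreshNodesSketch {

    static class DatanodeSketch {
        boolean beingRecommissioned;   // node moving from decommissioning back to in-service
        List<String> blocks;           // the B block replicas hosted on this datanode
    }

    // stands in for the FSNamesystem lock taken via namesystem.writeLock()
    private final ReentrantReadWriteLock namesystemLock = new ReentrantReadWriteLock();

    // analogue of DatanodeManager.refreshNodes(): the whole refresh runs
    // under the namesystem write lock
    void refreshNodes(List<DatanodeSketch> datanodes) {
        namesystemLock.writeLock().lock();
        try {
            refreshDatanodes(datanodes);               // outer loop over N datanodes
        } finally {
            namesystemLock.writeLock().unlock();
        }
    }

    // analogue of DatanodeManager.refreshDatanodes() ->
    // DecommissionManager.stopDecommission()
    private void refreshDatanodes(List<DatanodeSketch> datanodes) {
        for (DatanodeSketch dn : datanodes) {          // N iterations
            if (dn.beingRecommissioned) {
                processExtraRedundancyBlocksOnInService(dn);
            }
        }
    }

    // analogue of BlockManager.processExtraRedundancyBlocksOnInService():
    // scans every block of the node while the write lock is still held,
    // so the total work inside one lock acquisition is O(N * B)
    private void processExtraRedundancyBlocksOnInService(DatanodeSketch dn) {
        for (String block : dn.blocks) {               // B iterations per node
            // check whether this block is now over-replicated and, if so,
            // queue excess replicas for removal (details elided)
        }
    }
}
{code}

For illustration only (assumed numbers, not measurements): with N = 100 recommissioned datanodes each holding B = 1,000,000 block replicas, the N*(136*B+137) estimate above comes to roughly 1.36 * 10^10 executed lines within a single write-lock hold.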
> Potential scalability bug on datanode recommission
> --------------------------------------------------
>
>                 Key: HDFS-11684
>                 URL: https://issues.apache.org/jira/browse/HDFS-11684
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.7.3
>            Reporter: Riza Suminto
>

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org