Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hdfs-issues@hadoop.apache.org
Date: Mon, 2 Feb 2015 22:48:35 +0000 (UTC)
From: "Andrew Wang (JIRA)" <jira@apache.org>
To: hdfs-issues@hadoop.apache.org
Message-ID: <JIRA.12756585.1416442439000.234545.1422917315956@Atlassian.JIRA>
In-Reply-To: <JIRA.12756585.1416442439000@Atlassian.JIRA>
References: <JIRA.12756585.1416442439000@Atlassian.JIRA>
 <JIRA.12756585.1416442439700@arcas>
Subject: [jira] [Commented] (HDFS-7411) Refactor and improve decommissioning
 logic into DecommissionManager
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302092#comment-14302092 ] 

Andrew Wang commented on HDFS-7411:
-----------------------------------

As discussed above, the old limiting scheme is seriously flawed. The amount of time spent is highly variable, since the limit is on # nodes rather than # blocks, and the size of each node is variable. It also counts both decommissioning and non-decommissioning nodes towards the limit.

That nodes can vary in # of blocks is really an argument for *not* using # nodes as a limit; # of blocks is superior. The 100k was chosen as a conservative number that will not lead to overly long wake-up times, which is the point of this limit. In fact, with this patch we should see far more predictable pause times for decommission work even with the old config. In addition, the incremental scan logic should improve overall decommission speed.
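
To illustrate the difference, here is a minimal sketch of a per-wake-up limit expressed in blocks rather than nodes, with the scan resuming where the previous wake-up stopped. This is not the code in the patch: the function name scan_tick, the pending queue, the node names, and the block counts are all hypothetical, and only the 100k figure comes from the discussion above.

```python
from collections import deque

def scan_tick(pending, blocks_per_tick):
    """Scan at most blocks_per_tick blocks in one wake-up (hypothetical helper).

    pending is a deque of [node_name, remaining_blocks] entries that holds
    decommissioning nodes only. The scan resumes from the front of the queue,
    so a large node is spread over several wake-ups instead of one long pause.
    Returns a list of (node_name, blocks_scanned) for this wake-up.
    """
    budget = blocks_per_tick
    work = []
    while pending and budget > 0:
        entry = pending[0]
        scanned = min(entry[1], budget)
        entry[1] -= scanned
        budget -= scanned
        work.append((entry[0], scanned))
        if entry[1] == 0:
            # Node fully scanned; move on to the next one.
            pending.popleft()
    return work

# Made-up example: one large node and one small node.
pending = deque([["dn1", 250_000], ["dn2", 30_000]])
ticks = []
while pending:
    ticks.append(scan_tick(pending, 100_000))
```

With a node-count limit, a single wake-up covering dn1 would have to walk all 250,000 blocks at once. With the block budget, no wake-up in this sketch handles more than 100,000 blocks regardless of how blocks are distributed across nodes.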

Because of this, I do not see any advantage to keeping the old code around. It is worse in terms of both predictable pause times and overall decommissioning speed, and it has other flaws that this patch corrects. The new code is compatible with the old configuration. Splitting the refactoring out would also require a lot of work.

I still plan to commit tomorrow.

> Refactor and improve decommissioning logic into DecommissionManager
> -------------------------------------------------------------------
>
>                 Key: HDFS-7411
>                 URL: https://issues.apache.org/jira/browse/HDFS-7411
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.5.1
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>         Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, hdfs-7411.006.patch, hdfs-7411.007.patch, hdfs-7411.008.patch, hdfs-7411.009.patch, hdfs-7411.010.patch
>
>
> Would be nice to split out decommission logic from DatanodeManager to DecommissionManager.

