Reply-To: hdfs-issues@hadoop.apache.org
Message-ID: <554970.324061294868088127.JavaMail.jira@thor>
Date: Wed, 12 Jan 2011 16:34:48 -0500 (EST)
From: "Suresh Srinivas (JIRA)" <jira@apache.org>
To: hdfs-issues@hadoop.apache.org
Subject: [jira] Commented: (HDFS-1547) Improve decommission mechanism
In-Reply-To: <662967.224821292884142120.JavaMail.jira@thor>


    [ https://issues.apache.org/jira/browse/HDFS-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980939#action_12980939 ] 

Suresh Srinivas commented on HDFS-1547:
---------------------------------------

Thinking a bit more about the problem, I think there could be issues in some cases.
Consider a cluster with N nodes, L live and D decommissioned (N = L + D), with transceiver load on each datanode {X1, X2, ... XN}, and let X be the total load across all datanodes.

A datanode is not good for writes when Xi > 2 * X / (L + D)

Assuming decommissioned nodes carry little or no load, they still count in the denominator, so when D > L the threshold drops below the average load of a live node, X / L, and a lot of the nodes will not be eligible for writes. The remaining good nodes have to take the write load, which pushes X higher. Read traffic, which is not subject to the above condition, also pushes X higher. In the worst case, if the load is the same on every live node and write load dominates reads, then very few or no nodes are good for writes!
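
The condition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the namenode's code: `eligible_for_write` and `live_eligible` are hypothetical helpers, decommissioned nodes are assumed to carry zero load, and every live node is assumed to carry the same made-up load.

```python
def eligible_for_write(loads):
    """Return indices of nodes that pass the check Xi <= 2 * X / (L + D).

    `loads` holds the per-node load for all N = L + D nodes; decommissioned
    nodes are assumed to appear with load 0 (an assumption of this sketch).
    """
    total = sum(loads)                   # X, total load over all nodes
    threshold = 2 * total / len(loads)   # 2 * X / (L + D)
    return [i for i, x in enumerate(loads) if x <= threshold]


def live_eligible(live, decommissioned, per_node_load=10):
    # Live nodes come first, then decommissioned nodes with zero load.
    loads = [per_node_load] * live + [0] * decommissioned
    # Count only live nodes; decommissioned nodes never take writes.
    return sum(1 for i in eligible_for_write(loads) if i < live)


print(live_eligible(live=6, decommissioned=4))  # D < L
print(live_eligible(live=5, decommissioned=5))  # D == L
print(live_eligible(live=4, decommissioned=6))  # D > L
```

With equal loads on the live nodes, every live node passes the check while D <= L, and none passes once D > L.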


Some observations:
# This problem becomes severe as D approaches and exceeds N/2.
# Decommissioning such a large number of datanodes has several issues:
#* It reduces the cluster's free storage available for writes. Writes could simply fail because there is no free storage. The decommissioning may not complete, also because of the lack of free storage.
#* Further, when this happens, the number of nodes available for writes is significantly reduced (as writes are not done to D nodes).
#* Note that this problem also exists while decommissioning is in progress for a large number of nodes.
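
To illustrate the first observation, here is a rough Python sweep over D for a hypothetical 10-node cluster. The cluster size, the mildly uneven per-node loads, and the helper `eligible_live_nodes` are all made up for this sketch, and decommissioned nodes are again assumed to carry no load.

```python
def eligible_live_nodes(live_loads, decommissioned):
    """Count live nodes with Xi <= 2 * X / (L + D); D nodes carry no load."""
    total = sum(live_loads)                                      # X
    threshold = 2 * total / (len(live_loads) + decommissioned)   # 2 * X / (L + D)
    return sum(1 for x in live_loads if x <= threshold)


N = 10  # hypothetical cluster size
for d in range(0, N):
    live = N - d
    loads = [10 + j for j in range(live)]  # mildly uneven, made-up loads
    print(f"D={d} L={live} eligible={eligible_live_nodes(loads, d)}")
```

With these numbers all live nodes stay eligible up to D = 4, only 3 of 5 remain eligible at D = N/2, and none are eligible from D = 6 onwards.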

Given this, I am leaning towards not handling this case.


> Improve decommission mechanism
> ------------------------------
>
>                 Key: HDFS-1547
>                 URL: https://issues.apache.org/jira/browse/HDFS-1547
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 0.23.0
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>             Fix For: 0.23.0
>
>         Attachments: HDFS-1547.1.patch, HDFS-1547.patch
>
>
> The current decommission mechanism, driven by the exclude file, has several issues. This issue proposes some changes to the mechanism for better manageability. See the proposal in the next comment for more details.
