Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hdfs-issues@hadoop.apache.org
Date: Mon, 3 Feb 2014 19:11:42 +0000 (UTC)
From: "Todd Lipcon (JIRA)" <jira@apache.org>
To: hdfs-issues@hadoop.apache.org
Message-ID: <JIRA.12618208.1354218040367.28650.1391454702626@arcas>
In-Reply-To: <JIRA.12618208.1354218040367@arcas>
References: <JIRA.12618208.1354218040367@arcas>
Subject: [jira] [Commented] (HDFS-4239) Means of telling the datanode to
 stop using a sick disk
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HDFS-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889769#comment-13889769 ] 

Todd Lipcon commented on HDFS-4239:
-----------------------------------

A couple of quick high-level comments:

- What's the authorization requirement here? The patch doesn't seem to do any access control, but I wouldn't want a non-admin to make these changes.
- It seems odd that the "mark this volume dead" state is non-persistent across restarts. If a disk is "dying", I'm nervous that someone would mark it bad, and then a later rolling restart of the service would revive it. Something like a config file of "blacklisted volume IDs" and a 'refresh' RPC might be more resistant to this type of issue -- or a marker file like "disallow_this_volume" in the storage directory?
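
To make the marker-file idea concrete, here is a minimal sketch of how a persistent check might look. Everything in it is an assumption for illustration: the class VolumeMarkerFilter and its methods are hypothetical and are not part of HDFS or of the attached patches; only the marker name "disallow_this_volume" comes from the comment above. It uses plain java.nio rather than any DataNode internals.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, NOT an HDFS class and not taken from the patch.
final class VolumeMarkerFilter {
    // Marker file name as floated in the comment above.
    static final String MARKER = "disallow_this_volume";

    private VolumeMarkerFilter() {}

    // A volume is considered disallowed when the marker file exists
    // directly inside its storage directory.
    static boolean isDisallowed(Path storageDir) {
        return Files.exists(storageDir.resolve(MARKER));
    }

    // Operator-side action: create the marker so the decision is stored
    // on disk and is still visible after a restart. Idempotent.
    static void markDisallowed(Path storageDir) {
        try {
            Files.createFile(storageDir.resolve(MARKER));
        } catch (FileAlreadyExistsException e) {
            // Already marked; nothing to do.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Would run at startup and on a 'refresh' request: return only the
    // configured directories that do not carry the marker.
    static List<Path> usableVolumes(List<Path> configuredDirs) {
        List<Path> usable = new ArrayList<>();
        for (Path dir : configuredDirs) {
            if (!isDisallowed(dir)) {
                usable.add(dir);
            }
        }
        return usable;
    }
}
```

Because the marker lives on the filesystem, a rolling restart would re-run usableVolumes() and keep skipping the sick disk. One caveat of this sketch: a disk that is failing badly enough may not let the marker be written or read reliably, which is an argument for the config-file variant instead.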

> Means of telling the datanode to stop using a sick disk
> -------------------------------------------------------
>
>                 Key: HDFS-4239
>                 URL: https://issues.apache.org/jira/browse/HDFS-4239
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Jimmy Xiang
>         Attachments: hdfs-4239.patch, hdfs-4239_v2.patch, hdfs-4239_v3.patch
>
>
> If a disk has been deemed 'sick' -- i.e. not dead but wounded, failing occasionally, or just exhibiting high latency -- your choices are:
> 1. Decommission the entire datanode.  If the datanode is carrying 6 or 12 disks of data, especially on a cluster that is smallish -- 5 to 20 nodes -- the rereplication of the downed datanode's data can be pretty disruptive, especially if the cluster is doing low latency serving: e.g. hosting an hbase cluster.
> 2. Stop the datanode, unmount the bad disk, and restart the datanode (You can't unmount the disk while it is in use).  This latter is better in that only the bad disk's data is rereplicated, not all datanode data.
> Is it possible to do better -- say, send the datanode a signal to tell it to stop using a disk an operator has designated 'bad'?  This would be like option #2 above minus the need to stop and restart the datanode.  Ideally the disk would become unmountable after a while.
> Nice to have would be being able to tell the datanode to restart using a disk after it's been replaced.


--
This message was sent by Atlassian JIRA
(v6.1.5#6160)