From: "David B. Ritch" <david.ritch@gmail.com>
Date: Thu, 10 Sep 2009 23:06:49 -0400
To: common-user@hadoop.apache.org
Subject: Re: Decommissioning Individual Disks

Thank you both. That's what we did today. It seems fairly reasonable when
a node has only a few disks, say 3-5. At sites with larger nodes, though,
it becomes more awkward: when a node has a dozen or more disks (as used in
the larger terasort benchmarks), migrating the data off all of them is
likely to be a real problem. I hope there is a better solution to this
before my client moves to much larger nodes! ;-)

dbr

On 9/10/2009 10:07 PM, Amandeep Khurana wrote:
> I think decommissioning the node and replacing the disk is a cleaner
> approach. That's what I'd recommend doing as well.
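For reference, the decommission route recommended here is driven by an exclude file named in the namenode configuration. This is a minimal sketch using the 0.20-era property name; the exclude-file path is an assumption:

```xml
<!-- hadoop-site.xml / hdfs-site.xml snippet; the /etc/hadoop/conf path
     below is a hypothetical example, not a required location -->
<property>
  <name>dfs.hosts.exclude</name>
  <value>/etc/hadoop/conf/dfs.exclude</value>
</property>
```

With that in place, the usual cycle is: add the datanode's hostname to the exclude file, run `hadoop dfsadmin -refreshNodes`, wait for the node to show as decommissioned in `hadoop dfsadmin -report` (or the namenode web UI), replace the disk, then remove the hostname and refresh again to readmit the node.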
>
> On 9/10/09, Alex Loddengaard wrote:
>> Hi David,
>>
>> Unfortunately there's really no way to do what you're hoping to do in an
>> automatic way. You can move the block files (including their .meta files)
>> from one disk to another. Do this while the datanode daemon is stopped.
>> Then, when you start the datanode daemon, it will scan dfs.data.dir and
>> be totally happy if blocks have moved between hard drives. I've never
>> tried this myself, but others on the list have suggested the technique
>> for "balancing disks."
>>
>> You could also change your process around a little. It's not too crazy to
>> decommission an entire node, replace one of its disks, and then bring it
>> back into the cluster. Seems to me that this is a much saner approach:
>> your ops team tells you which disk needs replacing, you decommission the
>> node, they replace the disk, and you add the node back to the pool. Your
>> call, I guess.
>>
>> Hope this was helpful.
>>
>> Alex
>>
>> On Thu, Sep 10, 2009 at 6:30 PM, David B. Ritch wrote:
>>
>>> What do you do with the data on a failing disk when you replace it?
>>>
>>> Our support person comes in occasionally, and often replaces several
>>> disks when he does. These are disks that have not yet failed, but whose
>>> firmware indicates that failure is imminent. We need to be able to
>>> migrate our data off these disks before replacing them. If we were
>>> replacing entire servers, we would decommission them - but we have 3
>>> data disks per server. If we were replacing one disk at a time, we
>>> wouldn't worry about it (because of redundancy). We can decommission
>>> the servers, but moving all the data off of all their disks is a waste.
>>>
>>> What's the best way to handle this?
>>>
>>> Thanks!
>>>
>>> David
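The manual block move Alex describes can be sketched in shell. The real source and destination would be directories listed in dfs.data.dir; the /tmp paths, block name, and generation stamp below are hypothetical stand-ins used purely to illustrate the move:

```shell
# Sketch, not a production script: simulate moving HDFS block files
# (each with its .meta checksum file) from a failing volume to a healthy
# one. Stop the datanode first, e.g.: bin/hadoop-daemon.sh stop datanode

OLD_VOL=/tmp/dfs_demo/disk1/current   # failing disk (hypothetical path)
NEW_VOL=/tmp/dfs_demo/disk2/current   # healthy disk (hypothetical path)
mkdir -p "$OLD_VOL" "$NEW_VOL"

# Stand in for one block file and its .meta file on the failing disk.
touch "$OLD_VOL/blk_1234567890" "$OLD_VOL/blk_1234567890_1001.meta"

# Move each block file together with its .meta file. On restart the
# datanode rescans dfs.data.dir and accepts blocks on any configured volume.
mv "$OLD_VOL"/blk_* "$NEW_VOL"/

ls "$NEW_VOL"
# Afterwards: bin/hadoop-daemon.sh start datanode
```

Because blocks are matched to their checksums by filename, the block file and its .meta file must always travel together, which the `blk_*` glob above guarantees.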