Subject: Re: Remove one directory from multiple dfs.data.dir, how?
From: elton sky <eltonsky9404@gmail.com>
To: hdfs-user@hadoop.apache.org
Date: Mon, 4 Apr 2011 21:21:07 +1000

Thanks Harsh,

I will give it a go as you suggested.

But it doesn't feel convenient in my case. Decommissioning is for taking down a whole node, while what I am doing here is only taking out a directory. In my case, all I need to do is copy the files from the directory I want to remove into the remaining directories on the node, isn't it? Why doesn't Hadoop have this functionality?

On Mon, Apr 4, 2011 at 5:05 PM, Harsh Chouraria <harsh@cloudera.com> wrote:
> Hello Elton,
>
> On Mon, Apr 4, 2011 at 11:44 AM, elton sky <eltonsky9404@gmail.com> wrote:
> > Now I want to remove 1 disk from each node, say /data4/hdfs-data. What
> > should I do to keep data integrity?
> > -Elton
>
> This can be done using the reliable 'decommission' process:
> recommission the nodes after having reconfigured them. (Multiple nodes may be
> taken down per decommission round this way, but be wary of your
> cluster's actual used data capacity and your minimum replication
> factors.) Read more about the decommission process here:
> http://hadoop.apache.org/hdfs/docs/r0.21.0/hdfs_user_guide.html#DFSAdmin+Command
> and http://developer.yahoo.com/hadoop/tutorial/module2.html#decommission
>
> You may also have to run a cluster-wide balancer across the DataNodes after the
> entire process is done, to get rid of some skew in the distribution of
> data across them.
>
> (P.S. As an alternative solution, you may bring down one DataNode at a
> time, reconfigure it individually, and bring it up again; then repeat
> with the next one once the NameNode's fsck reports a healthy situation again (no
> under-replicated blocks). But decommissioning is the guaranteed safe
> way and is easier to do for some bulk of nodes.)
>
> --
> Harsh J
> Support Engineer, Cloudera

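The decommission-and-reconfigure round described above touches two pieces of configuration. A minimal sketch, assuming the 0.20/0.21-era property names and placeholder paths (`/etc/hadoop/excludes` and the `/dataN/hdfs-data` directories are assumptions; after editing the excludes file, `hadoop dfsadmin -refreshNodes` tells the NameNode to start decommissioning the listed hosts):

```xml
<!-- hdfs-site.xml on the NameNode: point at a file listing the
     hostnames to decommission, one per line. -->
<property>
  <name>dfs.hosts.exclude</name>
  <value>/etc/hadoop/excludes</value>
</property>

<!-- hdfs-site.xml on the reconfigured DataNode: dfs.data.dir with the
     unwanted directory (/data4/hdfs-data) dropped from the list. -->
<property>
  <name>dfs.data.dir</name>
  <value>/data1/hdfs-data,/data2/hdfs-data,/data3/hdfs-data</value>
</property>
```

Once the node is recommissioned (removed from the excludes file, followed by another `-refreshNodes`), it rejoins the cluster with the smaller set of storage directories.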
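The balancer run mentioned above is a single command; the threshold (maximum percentage a DataNode's utilization may deviate from the cluster average) shown here is just an example value:

```shell
$ # Re-spread blocks across DataNodes after the recommission round.
$ hadoop balancer -threshold 5
```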
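The one-node-at-a-time alternative can be sketched as the following per-node transcript (script paths and the fsck filter are assumptions; run it for one DataNode, wait for a healthy fsck, then move to the next):

```shell
$ # On the DataNode: stop it, drop /data4/hdfs-data from dfs.data.dir
$ # in hdfs-site.xml, then start it again.
$ bin/hadoop-daemon.sh stop datanode
$ vi conf/hdfs-site.xml     # remove /data4/hdfs-data from dfs.data.dir
$ bin/hadoop-daemon.sh start datanode

$ # Before touching the next node, confirm the NameNode reports a
$ # healthy filesystem with no under-replicated blocks.
$ hadoop fsck / | grep -E 'Under-replicated blocks|Status'
```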