hadoop-common-user mailing list archives

From Patai Sangbutsarakum <silvianhad...@gmail.com>
Subject balance blocks between small and bigger disks in the same datanode.
Date Mon, 24 Oct 2011 19:09:59 GMT
Hi All,

I looked through the FAQ but still have some questions.
The datanodes in my production cluster are running low on space in one of their dfs.data.dir directories:

Filesystem   Size  Used  Avail  Use%  Mounted on
/dev/sda5    355G  322G    33G   91%  /hadoop1   <----
/dev/sdb1    484G  324G   161G   67%  /hadoop2
/dev/sdc1    484G  318G   167G   66%  /hadoop3
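
For checking many hosts, the same comparison could be scripted. The following is a minimal sketch, assuming POSIX `df -P` output in which column 5 is the capacity and column 6 the mount point; `volumes_over` is a hypothetical helper name, not an existing command.

```shell
#!/bin/sh
# volumes_over THRESHOLD
# Reads `df -P` output on stdin and prints every mount point whose
# capacity column is at or above THRESHOLD percent.
# (volumes_over is a hypothetical helper, not a Hadoop or coreutils command.)
volumes_over() {
    awk -v limit="$1" '
        NR > 1 {                    # skip the df header line
            pct = $5                # df -P column 5 is the capacity, e.g. "91%"
            sub(/%/, "", pct)
            if (pct + 0 >= limit + 0) print $6   # column 6 is the mount point
        }'
}
```

Example use on a datanode: `df -P /hadoop1 /hadoop2 /hadoop3 | volumes_over 90`.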

/hadoop1 has had less space from the very beginning because its drive
is shared with the operating system.
I found this FAQ entry on the wiki:
"3.12. On an individual data node, how do you balance the blocks on the disk?

Hadoop currently does not have a method by which to do this
automatically. To do this manually:

1. Take down the HDFS
2. Use the UNIX mv command to move the individual blocks and meta
   pairs from one directory to another on each host
3. Restart the HDFS"

Question about step 1, "Take down the HDFS":
does that mean the whole cluster, or just the datanode process on a
single datanode/tasktracker host?

Questions about step 2:

2.1 "moving blk and meta pair."

Do "blk and meta pairs" refer to files like these?

$ cd /hadoop1/data/current
$ ls -al *8816473533602921489*
-rw-rw-r-- 1 apps apps 1734467 Aug 27 21:03 blk_-8816473533602921489
-rw-rw-r-- 1 apps apps      63 Aug 27 21:03
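
To see which files belong together, one could list each block file next to its companion. This is a minimal sketch, assuming the metadata file is named after the block file with a `_<something>.meta` suffix; the listing above cuts off the second file name, so that naming is an assumption here, and `list_block_pairs` is a hypothetical helper.

```shell
#!/bin/sh
# list_block_pairs DIR
# Prints one line per block file in DIR: the block file name, then the name
# of its metadata companion, or MISSING if none is found.
# Assumption: the companion is named <block file name>_<something>.meta.
list_block_pairs() {
    dir=$1
    for blk in "$dir"/blk_*; do
        [ -f "$blk" ] || continue               # also skips an unmatched glob
        case $blk in *.meta) continue ;; esac   # meta files are not blocks
        meta=MISSING
        for candidate in "$blk"_*.meta; do
            [ -f "$candidate" ] && meta=$candidate && break
        done
        echo "$(basename "$blk") $(basename "$meta")"
    done
}
```

Example use: `list_block_pairs /hadoop1/data/current`.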


2.2 "from one directory to another on each host"

Do the blk (and meta) files from one "current" directory have to land
in the "current" directory of another dfs.data.dir, like this?
mv /hadoop1/data/current/*8816473533602921489* /hadoop2/data/current/

Or can the destination directory have a different name?
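
A slightly more defensive version of that mv could refuse to overwrite anything at the destination. This is a minimal sketch under several assumptions: HDFS has already been taken down as in step 1 of the FAQ, the destination is the "current" directory of the other dfs.data.dir (whether any other directory works is exactly the open question), and metadata files follow a `blk_<id>_*.meta` naming. `move_block_pair` is a hypothetical helper, not a Hadoop tool.

```shell
#!/bin/sh
# move_block_pair BLOCK_ID SRC_DIR DST_DIR
# Moves blk_<BLOCK_ID> and its blk_<BLOCK_ID>_*.meta file(s) from SRC_DIR to
# DST_DIR. Nothing is moved if any of those names already exists in DST_DIR.
# (Hypothetical helper; assumes the datanode is not running.)
move_block_pair() {
    id=$1 src=$2 dst=$3
    # First pass: check for name clashes before touching anything.
    for f in "$src/blk_$id" "$src/blk_$id"_*.meta; do
        [ -f "$f" ] || continue
        if [ -e "$dst/$(basename "$f")" ]; then
            echo "refusing: $(basename "$f") already exists in $dst" >&2
            return 1
        fi
    done
    # Second pass: move the block file and its metadata together.
    for f in "$src/blk_$id" "$src/blk_$id"_*.meta; do
        [ -f "$f" ] || continue
        mv -- "$f" "$dst/" || return 1
    done
}
```

Example use: `move_block_pair -8816473533602921489 /hadoop1/data/current /hadoop2/data/current`.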

2.3 What about subdirXX?

Under /hadoop1/data/current/ there are:
55G	subdir36
49G	subdir37

It is tempting to move subdir36 and subdir37 because they are huge.
Should the command look like this?

mv /hadoop1/data/current/subdir36/*  /hadoop2/data/current/subdir36/

However, /hadoop2/data/current/subdir36/ already contains a bunch of
blk (and meta) files and a bunch of subdirectories of its own,
which means that if I do the move, some names might collide?
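
Whether names would actually collide can be checked before moving anything. This is a minimal sketch that only compares the top-level entry names of two directories; `list_collisions` is a hypothetical helper, and it says nothing about what HDFS itself expects from the directory layout.

```shell
#!/bin/sh
# list_collisions SRC_DIR DST_DIR
# Prints every entry name (file or directory) that exists in both directories.
# Only the top level is compared; nested subdirectories are not descended into.
list_collisions() {
    src=$1 dst=$2
    for f in "$src"/*; do
        [ -e "$f" ] || continue        # skips an unmatched glob
        name=$(basename "$f")
        if [ -e "$dst/$name" ]; then
            echo "$name"
        fi
    done
}
```

Example use: `list_collisions /hadoop1/data/current/subdir36 /hadoop2/data/current/subdir36`. Empty output would mean no same-named entries at that level.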

Thanks in advance.
