hadoop-hdfs-user mailing list archives

From Tomasz Chmielewski <man...@wpkg.org>
Subject rebalancing data on disks?
Date Mon, 31 Oct 2011 13:49:37 GMT

I have an HDFS cluster consisting of several hosts.

On each node, I add a new disk when the current capacity is close to full.

Right now, every server has roughly the following distribution of data:

/dev/sdf              493G  468G   51M 100% /data1
/dev/sdg              493G  468G   51M 100% /data2
/dev/sdh              493G  103G  365G  22% /data3
/dev/sdi              493G  100G  368G  22% /data4

So /dev/sdf and /dev/sdg are almost 100% full, while there is plenty of free 
space on /dev/sdh and /dev/sdi.
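
To put numbers on the imbalance, here is a rough sketch of the arithmetic. It uses only the Size and Used columns from the df output above, with sizes in the "G" units df reports. The names even_target and moves_needed are made up for illustration and are not part of any Hadoop tool; the sketch only computes how much data would have to move for every disk to end up equally full.

```python
# Per-disk figures copied from the df output above: (mount point, size, used),
# in the "G" units that df reports.
disks = [
    ("/data1", 493, 468),
    ("/data2", 493, 468),
    ("/data3", 493, 103),
    ("/data4", 493, 100),
]

def even_target(disks):
    """Fraction of each disk that would be used if data were spread evenly."""
    total_size = sum(size for _, size, _ in disks)
    total_used = sum(used for _, _, used in disks)
    return total_used / total_size

def moves_needed(disks):
    """Map each mount point to the amount to move off it (positive)
    or onto it (negative) to reach the even target."""
    target = even_target(disks)
    return {mount: used - target * size for mount, size, used in disks}

target = even_target(disks)
plan = moves_needed(disks)

print(f"even utilisation: {target:.1%}")
for mount, delta in plan.items():
    direction = "move off" if delta > 0 else "move onto"
    print(f"{mount}: {direction} {abs(delta):.2f}G")
```

With these figures, each disk would sit at a bit under 58% if the data were spread evenly, which means moving roughly 183G off each of /data1 and /data2.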

Disks which are 100% full don't make monitoring very happy.

Is it possible to rebalance data across the disks within one HDFS server (or 
on several servers)?

"hadoop balancer" rebalances data between the servers, but 
not between the disks on one server.

Tomasz Chmielewski
