First, confirm whether the new nodes have actually joined the cluster. You can use the "hadoop dfsadmin -report" command to check per-node HDFS usage.
If the new nodes are listed in that report, you can run the HDFS balancer to manually redistribute some of the data across the DataNodes.
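As a rough sketch, the steps above look like this (assuming the HDFS binaries are on your PATH; on newer releases the `hdfs` command is preferred over the older `hadoop dfsadmin` form, and the `-threshold` value shown is just an illustrative choice):

```shell
# List registered DataNodes with their configured capacity,
# DFS used, and remaining space per node.
hdfs dfsadmin -report

# Run the balancer. -threshold is the allowed deviation (in percent)
# of each DataNode's utilization from the cluster-wide average;
# 10 is the default, and a smaller value balances more aggressively
# at the cost of more block movement.
hdfs balancer -threshold 10
```

The balancer can run while the cluster is in use; it throttles its own bandwidth, so on a busy cluster it may take a while to move data onto the new nodes.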


On 07-Feb-2015 4:24 AM, "Manoj Venkatesh" <> wrote:
Dear Hadoop experts,

I have a Hadoop cluster of 8 nodes: 6 were added during cluster creation, and 2 additional nodes were added later to increase disk and CPU capacity. What I see is that processing is shared among all the nodes, whereas storage is reaching capacity on the original 6 nodes while the newly added machines still have a relatively large amount of unoccupied storage.

I was wondering if there is an automated way, or any way at all, of redistributing data so that all the nodes are equally utilized. I have checked the configuration parameter dfs.datanode.fsdataset.volume.choosing.policy, which has the options 'Round Robin' and 'Available Space'; are there any other configurations which need to be reviewed?