hadoop-user mailing list archives

From: Akira AJISAKA <ajisa...@oss.nttdata.co.jp>
Subject: Re: Adding datanodes to Hadoop cluster - Will data redistribute?
Date: Sat, 07 Feb 2015 01:32:54 GMT
Hi Manoj,

You need to run the HDFS balancer to re-balance data between the nodes:
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#Balancer
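
For example, a minimal run might look like the sketch below (assuming a
Hadoop 2.x "hdfs" client on the path; the threshold and bandwidth values
are illustrative, not recommendations):

  # Move blocks until every datanode's utilization is within 10% of the
  # cluster-average utilization.
  hdfs balancer -threshold 10

  # Optionally raise the per-datanode balancing bandwidth (bytes/second)
  # beforehand so the re-balance finishes sooner.
  hdfs dfsadmin -setBalancerBandwidth 104857600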

 > *dfs.datanode.fsdataset.volume.choosing.policy*, which has the options
 > 'Round Robin' and 'Available Space'. Are there any other configurations
 > which need to be reviewed?
That option only controls how a single datanode chooses among its own disks
(volumes); it does not move data between nodes.
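
If you do also want the 'Available Space' behaviour for the disks inside
each datanode, it is set in hdfs-site.xml; the snippet below is only a
sketch using the stock Hadoop class name:

  <!-- hdfs-site.xml: pick the target volume (disk) on each datanode by
       available space instead of the default round-robin policy. -->
  <property>
    <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
    <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
  </property>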

Regards,
Akira

On 2/6/15 11:34, Manoj Venkatesh wrote:
> Dear Hadoop experts,
>
> I have a Hadoop cluster of 8 nodes; 6 were added during cluster creation
> and 2 additional nodes were added later to increase disk and CPU
> capacity. What I see is that processing is shared amongst all the nodes,
> whereas storage is reaching capacity on the original 6 nodes while the
> newly added machines still have a relatively large amount of storage
> unoccupied.
>
> I was wondering if there is an automated way, or any other way, of
> redistributing data so that all the nodes are equally utilized. I have
> checked the configuration parameter
> *dfs.datanode.fsdataset.volume.choosing.policy*, which has the options
> 'Round Robin' and 'Available Space'. Are there any other configurations
> which need to be reviewed?
>
> Thanks,
> Manoj

