hadoop-common-user mailing list archives

From Vikas Parashar <para.vi...@gmail.com>
Subject Re: Adding datanodes to Hadoop cluster - Will data redistribute?
Date Sat, 07 Feb 2015 04:37:09 GMT
Hi Manoj,

Please try

http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#Balancer
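
A minimal sketch of running the balancer from the command line (assuming the
default 10% threshold is acceptable for your cluster):

  # Run as the HDFS superuser; the balancer moves blocks between DataNodes
  # until each node's utilization is within 10% of the cluster average.
  hdfs balancer -threshold 10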


Regards,
Vikas Parashar (Vicky)

On Sat, Feb 7, 2015 at 1:04 AM, Manoj Venkatesh <manovenki@gmail.com> wrote:

> Dear Hadoop experts,
>
> I have a Hadoop cluster of 8 nodes; 6 were added during cluster creation
> and 2 additional nodes were added later to increase disk and CPU capacity.
> What I see is that processing is shared amongst all the nodes, whereas
> storage is reaching capacity on the original 6 nodes while the newly
> added machines still have a relatively large amount of storage unoccupied.
>
> I was wondering if there is an automated way, or any way at all, of
> redistributing data so that all the nodes are equally utilized. I have
> checked the configuration parameter
> *dfs.datanode.fsdataset.volume.choosing.policy*, which has the options
> 'Round Robin' and 'Available Space'; are there any other configurations
> which need to be reviewed?
>
> Thanks,
> Manoj
>
