hadoop-mapreduce-user mailing list archives

From Artem Ervits <artemerv...@gmail.com>
Subject Re: Adding datanodes to Hadoop cluster - Will data redistribute?
Date Sat, 07 Feb 2015 14:17:46 GMT
Look at hdfs balancer
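
As a minimal sketch (assuming a configured HDFS client on the node where you run it; the threshold and bandwidth values below are illustrative, not recommendations):

```shell
# Run the HDFS balancer, which moves blocks from over-utilized to
# under-utilized datanodes until each node's usage is within the given
# number of percentage points of the cluster average (10 is the default).
hdfs balancer -threshold 10

# Optionally raise the bandwidth each datanode may use for balancing
# (value is bytes per second; here ~50 MB/s) before starting it:
hdfs dfsadmin -setBalancerBandwidth 52428800
```

The balancer can run while the cluster is in use; a lower threshold gives a more even distribution but takes longer.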

Artem Ervits
On Feb 6, 2015 5:54 PM, "Manoj Venkatesh" <manovenki@gmail.com> wrote:

> Dear Hadoop experts,
>
> I have a Hadoop cluster of 8 nodes: 6 were added when the cluster was
> created, and 2 more were added later to increase disk and CPU capacity.
> What I see is that processing is shared among all the nodes, but
> storage is reaching capacity on the original 6 nodes, while the newly
> added machines still have a relatively large amount of free space.
>
> I was wondering whether there is an automated way (or any way) of
> redistributing data so that all the nodes are equally utilized. I have
> checked the configuration parameter
> *dfs.datanode.fsdataset.volume.choosing.policy*, which offers the
> options 'Round Robin' and 'Available Space'. Are there any other
> configurations which need to be reviewed?
>
> Thanks,
> Manoj
>
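
For reference, the 'Available Space' policy mentioned in the question is set in hdfs-site.xml roughly as below. Note that this policy only controls how a single datanode chooses among its own local disks; rebalancing data *across* datanodes is the job of the balancer.

```xml
<!-- hdfs-site.xml: prefer volumes with more free space within each datanode -->
<property>
  <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
  <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
</property>
```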
