hadoop-mapreduce-user mailing list archives

From Ahmed Ossama <ah...@aossama.com>
Subject Re: Adding datanodes to Hadoop cluster - Will data redistribute?
Date Sat, 07 Feb 2015 13:59:04 GMT
Hi,

Have you tried running the balancer?

$ hdfs balancer
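
As I understand it, the balancer compares each DataNode's utilization with the utilization of the whole cluster and moves blocks until every node is within a threshold of the cluster figure (10 percentage points by default, adjustable with `-threshold`). The sketch below only illustrates that comparison; it is not the balancer's actual code, and the node names and sizes are made up for the example.

```python
# Illustrative sketch of the balancer's threshold test.
# Node names and byte counts are hypothetical, not taken from a real cluster.

def classify_nodes(nodes, threshold_pct=10.0):
    """Classify nodes against cluster-wide utilization.

    nodes maps a node name to a (used_bytes, capacity_bytes) tuple.
    Returns (cluster_utilization_pct, {name: status}).
    """
    total_used = sum(used for used, _ in nodes.values())
    total_capacity = sum(cap for _, cap in nodes.values())
    cluster_pct = 100.0 * total_used / total_capacity

    status = {}
    for name, (used, cap) in nodes.items():
        node_pct = 100.0 * used / cap
        if node_pct > cluster_pct + threshold_pct:
            status[name] = "over-utilized"
        elif node_pct < cluster_pct - threshold_pct:
            status[name] = "under-utilized"
        else:
            status[name] = "within threshold"
    return cluster_pct, status


if __name__ == "__main__":
    TB = 10**12
    # Six older nodes that are nearly full, two newer nodes that are nearly empty.
    nodes = {f"old{i}": (9 * TB, 10 * TB) for i in range(1, 7)}
    nodes.update({f"new{i}": (1 * TB, 10 * TB) for i in range(1, 3)})

    cluster_pct, status = classify_nodes(nodes)
    print(f"cluster utilization: {cluster_pct:.1f}%")
    for name in sorted(status):
        print(name, status[name])
```

A smaller threshold, for example `hdfs balancer -threshold 5`, asks for a tighter spread at the cost of a longer run. Note that dfs.datanode.fsdataset.volume.choosing.policy only decides which disk inside a single DataNode receives a new block, so it will not move data between nodes.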

On 02/06/2015 09:34 PM, Manoj Venkatesh wrote:
> Dear Hadoop experts,
>
> I have a Hadoop cluster of 8 nodes: 6 were added during cluster
> creation, and 2 more were added later to increase disk and CPU
> capacity. What I see is that processing is shared among all the
> nodes, but storage is reaching capacity on the original 6 nodes
> while the newly added machines still have a relatively large amount
> of storage unoccupied.
>
> I was wondering whether there is an automated way, or any way at all,
> of redistributing data so that all the nodes are equally utilized. I
> have checked the configuration parameter
> *dfs.datanode.fsdataset.volume.choosing.policy*, which has the options
> 'Round Robin' and 'Available Space'. Are there any other
> configurations that need to be reviewed?
>
> Thanks,
> Manoj

-- 
Regards,
Ahmed Ossama

