hadoop-mapreduce-user mailing list archives

From Manoj Venkatesh <manove...@gmail.com>
Subject Adding datanodes to Hadoop cluster - Will data redistribute?
Date Fri, 06 Feb 2015 19:34:43 GMT
Dear Hadoop experts,

I have a Hadoop cluster of 8 nodes: 6 were added when the cluster was
created, and 2 more were added later to increase disk and CPU capacity.
What I see is that processing is shared among all 8 nodes, but storage is
reaching capacity on the original 6 nodes while the 2 newly added machines
still have a relatively large amount of unoccupied storage.
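
To make the imbalance concrete, here is a minimal sketch of how I would
measure it: each node's utilization compared with the cluster-wide average.
The node names, capacities, and usage figures are hypothetical placeholders
chosen for illustration, not measurements from my cluster.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    capacity_gb: float
    used_gb: float

    @property
    def utilization(self) -> float:
        # Fraction of this node's own capacity that is in use.
        return self.used_gb / self.capacity_gb


def cluster_utilization(nodes):
    # Total used space divided by total capacity across all nodes.
    return sum(n.used_gb for n in nodes) / sum(n.capacity_gb for n in nodes)


def deviations(nodes):
    # Signed gap between each node's utilization and the cluster average.
    avg = cluster_utilization(nodes)
    return {n.name: n.utilization - avg for n in nodes}


# Hypothetical figures: 6 original nodes nearly full, 2 new nodes nearly empty.
nodes = [Node(f"old{i}", 1000, 900) for i in range(1, 7)]
nodes += [Node(f"new{i}", 1000, 100) for i in range(1, 3)]

for name, gap in deviations(nodes).items():
    print(f"{name}: {gap:+.0%} relative to cluster average")
```

A positive gap means the node is fuller than the cluster as a whole, and a
negative gap means it is emptier. "Equally utilized" would mean every gap is
close to zero.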

I was wondering whether there is an automated way, or any way at all, of
redistributing the data so that all the nodes are equally utilized. I have
looked at the configuration parameter
*dfs.datanode.fsdataset.volume.choosing.policy*, which has the options
'Round Robin' and 'Available Space'. Are there any other configurations
that need to be reviewed?

Thanks,
Manoj
