First, confirm whether the new nodes have actually joined the cluster. You can use the "hadoop dfsadmin -report" command to check per-node HDFS usage.
If the new nodes are listed in that report, you can run the Hadoop balancer to manually redistribute some of the existing data onto them.
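The check-then-balance flow above can be sketched as below. The -threshold value of 10 is only an illustrative choice (it tells the balancer to stop once every DataNode is within 10 percentage points of the cluster-average DFS utilization); the DRY_RUN guard is added here so the sketch can be read or run without a live cluster.

```shell
# Sketch: confirm the new DataNodes are registered, then rebalance.
# DRY_RUN=1 only prints the commands; set to 0 on a live cluster.
DRY_RUN=1

run() {
  if [ "$DRY_RUN" -eq 1 ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

# 1. Per-node HDFS usage; the newly added nodes should appear as live DataNodes:
run hadoop dfsadmin -report

# 2. Move blocks from over-utilized to under-utilized DataNodes
#    (-threshold 10 is an illustrative stopping point, not a required value):
run hadoop balancer -threshold 10
```

On newer Hadoop releases the same commands are also exposed as "hdfs dfsadmin -report" and "hdfs balancer"; the older "hadoop ..." forms shown here still work.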
Thanks,
Manoj

Dear Hadoop experts,

I have a Hadoop cluster of 8 nodes: 6 were added during cluster creation and 2 were added later to increase disk and CPU capacity. What I see is that processing is shared among all the nodes, but storage is reaching capacity on the original 6 nodes while the newly added machines still have a relatively large amount of storage unoccupied.
I was wondering if there is an automated way, or any way at all, of redistributing data so that all the nodes are equally utilized. I have checked the configuration parameter dfs.datanode.fsdataset.volume.choosing.policy, which offers 'Round Robin' or 'Available Space' options; are there any other configurations that need to be reviewed?
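For reference, selecting the available-space policy is done in hdfs-site.xml with the standard fully-qualified Hadoop class name, as in the sketch below. Note that this policy only chooses among the disks (volumes) within a single DataNode; it does not move data between nodes, which is the balancer's job.

```xml
<property>
  <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
  <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
</property>
```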