hadoop-hdfs-user mailing list archives

From David Ginzburg <ginz...@hotmail.com>
Subject Adding new data nodes to existing cluster, with different storage capacity
Date Thu, 20 Jan 2011 08:42:17 GMT

Our current cluster runs 22 data nodes, each with 4 TB of storage.
We will be adding new data nodes to this cluster, but each of them will have 8 TB of
storage capacity.
I am wondering how the namenode will distribute blocks. My understanding is that the replica
placement policy chooses data nodes at random, so blocks should end up spread evenly
across all nodes. If so, the smaller nodes
will eventually fill up while the larger nodes are only at 50%, at which point the small
nodes will become unusable.
Am I correct?
Is there a recommended practice for this case? Would running the balancer periodically help?
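
The arithmetic behind this concern can be sketched as below. This is a simplified model, not how the namenode actually places blocks: it assumes every node receives exactly the same amount of data, and it ignores replication, rack awareness, and any free-space checks. The count of 10 new nodes is a hypothetical figure chosen for illustration, since the message does not say how many will be added, and the function names are made up for this sketch.

```python
def fill_levels(node_capacities_tb, data_per_node_tb):
    """Used fraction of each node when every node holds the same amount of data."""
    return [min(data_per_node_tb / cap, 1.0) for cap in node_capacities_tb]


def usable_capacity_uniform(node_capacities_tb):
    """Total data the cluster can hold before the smallest node is full,
    assuming data is spread evenly across all nodes."""
    return min(node_capacities_tb) * len(node_capacities_tb)


OLD_NODES = 22   # from the message: 22 existing nodes with 4 TB each
NEW_NODES = 10   # hypothetical: the message does not give this number

capacities = [4.0] * OLD_NODES + [8.0] * NEW_NODES

raw_total = sum(capacities)
usable = usable_capacity_uniform(capacities)
stranded = raw_total - usable

# Fill level of each node at the moment the 4 TB nodes become full.
levels = fill_levels(capacities, data_per_node_tb=4.0)

print(f"raw capacity:       {raw_total:.0f} TB")
print(f"usable (uniform):   {usable:.0f} TB")
print(f"stranded on 8 TB:   {stranded:.0f} TB")
print(f"4 TB node fill:     {levels[0]:.0%}")
print(f"8 TB node fill:     {levels[-1]:.0%}")
```

Under these assumptions the 8 TB nodes sit at half capacity when the 4 TB nodes are full, which matches the 50% figure in the question. Whether a real cluster behaves this way depends on the placement policy and on whether the balancer is run.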

