Joydeep Sen Sarma commented on HDFS-1094:

one can't have a different node group for each block/file. that would defeat the whole point.
(in fact, every block today is in a 3-node node group, and there are gazillions of such
node groups that overlap).
the reduction in data loss probability comes from the fact that the odds of 3 failed nodes
falling into the same node group are small. (if they don't fall into the same node group, there's
no data loss.)
if the number of node groups is very large (because of overlaps), then the probability of
3 failing nodes falling into the same node group will start going up (just because there are
more node groups to choose from). the more the node groups are exclusive, the better. that
means minimizing the number of node groups for a fixed number of nodes. as i mentioned,
the size of the node group is dictated to some extent by re-replication bandwidth. one wants
very small node groups, but that doesn't work because there's not enough re-replication bandwidth
(a familiar problem in RAID).
if you take some standard cluster (say 8 racks x 40 nodes), how many distinct node groups would
your algorithm end up with?
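
The argument above can be made concrete with a rough back-of-the-envelope sketch. It is not the algorithm proposed in this JIRA. It assumes a simplified model: the cluster is partitioned into disjoint, equal-sized node groups, each block keeps all 3 replicas inside one group, the 3 failed nodes are chosen uniformly at random, and rack constraints are ignored. The function names and the block count of 10 million are hypothetical, chosen only for illustration.

```python
from math import comb


def p_same_group(num_nodes: int, group_size: int) -> float:
    """Probability that 3 distinct failed nodes, picked uniformly at random,
    all land in the same group when the cluster is split into disjoint
    groups of group_size nodes. This is an upper bound on data loss, since
    loss also requires that some block actually lives on that triple."""
    if num_nodes % group_size != 0:
        raise ValueError("group_size must divide num_nodes in this sketch")
    num_groups = num_nodes // group_size
    return num_groups * comb(group_size, 3) / comb(num_nodes, 3)


def p_hit_random_triples(num_nodes: int, num_blocks: int) -> float:
    """Probability that a given failed triple holds all 3 replicas of at
    least one block when every block picks its 3 nodes uniformly at random
    (i.e. every block is its own overlapping 3-node group)."""
    triples = comb(num_nodes, 3)
    return 1.0 - (1.0 - 1.0 / triples) ** num_blocks


NUM_NODES = 8 * 40  # the 8 racks x 40 nodes cluster from the comment

# Disjoint groups: smaller groups mean lower odds, but fewer peers
# available to share the re-replication work after a failure.
for size in (4, 8, 20, 40):
    print(size, NUM_NODES // size, p_same_group(NUM_NODES, size))

# Fully overlapping 3-node groups with an assumed 10 million blocks.
print(p_hit_random_triples(NUM_NODES, 10_000_000))
```

Under these assumptions the disjoint-group probability simplifies to (g-1)(g-2) / ((N-1)(N-2)) for group size g and N nodes, so it shrinks quickly as groups get smaller, which is the tension with re-replication bandwidth described above.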
> Intelligent block placement policy to decrease probability of block loss
> 
>
> Key: HDFS-1094
> URL: https://issues.apache.org/jira/browse/HDFS-1094
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Reporter: dhruba borthakur
> Assignee: Rodrigo Schmidt
> Attachments: prob.pdf, prob.pdf
>
>
> The current HDFS implementation specifies that the first replica is local and the other
two replicas are on any two random nodes on a random remote rack. This means that if any three
datanodes die together, then there is a non-trivial probability of losing at least one block
in the cluster. This JIRA is to discuss whether there is a better algorithm that can lower the
probability of losing a block.

