Thanks for your thoughts, guys.
I agree that with vnodes total downtime is lessened, although it also seems that the total number of outages (however small) would be greater.
But I think downtime is only lessened up to a certain cluster size.
I'm thinking that as the cluster continues to grow:
- node rebuild time will stop improving (a single node only has so much write bandwidth)
- the probability of 2 nodes being down at any given time will continue to increase -- even if you consider only non-correlated failures.
Therefore, when adding nodes beyond the point where node rebuild time stops improving, wouldn't both the total number of outages *and* overall downtime increase?
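
To make the question concrete, here is a rough back-of-the-envelope sketch in Python. It is only a toy model, not how Cassandra actually behaves: the function names, the failure rate, the data size and the bandwidth figures are all made-up assumptions, failures are treated as independent, and a rebuilding node is assumed to stream from every other node until its own write bandwidth becomes the limit. It also only computes the chance that any two or more nodes are down at the same moment, not whether a given token range is actually unavailable.

```python
def rebuild_hours(n_nodes, data_gb=1000.0,
                  peer_stream_gb_per_hr=50.0, write_cap_gb_per_hr=400.0):
    """Hours to rebuild one node (all figures are illustrative assumptions).

    Inbound rate grows with the number of peers streaming to the node,
    but is capped by the node's own write bandwidth.
    """
    inbound = min(write_cap_gb_per_hr, (n_nodes - 1) * peer_stream_gb_per_hr)
    return data_gb / inbound


def p_two_or_more_down(n_nodes, failures_per_node_hr=1e-4, **rebuild_kwargs):
    """Probability that at least two nodes are down at a random instant.

    Assumes independent (non-correlated) failures and that a node stays
    down for exactly one rebuild period after each failure.
    """
    t = rebuild_hours(n_nodes, **rebuild_kwargs)
    # Steady-state fraction of time a single node spends down.
    p = failures_per_node_hr * t / (1.0 + failures_per_node_hr * t)
    p_none = (1.0 - p) ** n_nodes
    p_one = n_nodes * p * (1.0 - p) ** (n_nodes - 1)
    return 1.0 - p_none - p_one


if __name__ == "__main__":
    for n in (4, 9, 16, 32, 64, 128):
        print(n, round(rebuild_hours(n), 2), p_two_or_more_down(n))
```

With these assumed numbers the write cap is reached at 9 nodes. Past that point the rebuild time is flat, so in this model the per-node downtime fraction is constant and the chance of two or more nodes being down together can only go up as nodes are added.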