Thanks for your thoughts, guys.

I agree that with vnodes total downtime is lessened, although it also seems that the total number of outages (however small) would be greater.

But I think downtime is only lessened up to a certain cluster size.

I'm thinking that as the cluster continues to grow:

  - node rebuild time will max out (a single node only has so much write bandwidth)

  - the probability of 2 nodes being down at any given time will continue to increase -- even if you consider only non-correlated failures.

Therefore, when adding nodes beyond the point where node rebuild time maxes out, both the total number of outages *and* overall downtime would increase?
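To make the argument above concrete, here is a rough back-of-the-envelope sketch. Everything in it is an assumption for illustration, not a measurement: the function names, the data size, the per-peer streaming rate, the write cap, and the failure rate are all made up, and failures are treated as independent. It models rebuild throughput as growing with the number of peers until it hits the rebuilding node's write bandwidth, then computes the chance that two or more nodes are down at once.

```python
def rebuild_hours(n_nodes, data_gb=1000.0, peer_stream_mb_s=10.0, write_cap_mb_s=100.0):
    """Hours to rebuild one node (illustrative model, all defaults are assumptions).

    Peers stream in parallel, so throughput grows with cluster size until the
    rebuilding node's own write bandwidth becomes the bottleneck.
    """
    throughput_mb_s = min((n_nodes - 1) * peer_stream_mb_s, write_cap_mb_s)
    return data_gb * 1024.0 / throughput_mb_s / 3600.0


def p_two_or_more_down(n_nodes, failures_per_node_year=2.0, **kw):
    """Probability that at least 2 nodes are down at a random instant.

    Assumes independent failures; each node is down for one rebuild period
    per failure, so it is down a fraction p of the time.
    """
    p = min(1.0, failures_per_node_year * rebuild_hours(n_nodes, **kw) / 8760.0)
    p_none = (1.0 - p) ** n_nodes
    p_one = n_nodes * p * (1.0 - p) ** (n_nodes - 1)
    return 1.0 - p_none - p_one


if __name__ == "__main__":
    for n in (3, 6, 11, 50, 200):
        print(n, round(rebuild_hours(n), 2), p_two_or_more_down(n))
```

Under these assumed numbers the rebuild time stops improving at 11 nodes, and from there on the probability of a double failure only goes up with cluster size. Whether a double failure actually causes an outage still depends on the replication factor and on which ranges the two nodes share.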

Thanks,

Eric

On Mon, Dec 10, 2012 at 7:00 AM, Edward Capriolo <edlinuxguru@gmail.com> wrote:

Assuming you need to work with quorum in a non-vnode scenario: if 2 nodes in a row in the ring are down, some number of quorum operations will fail with UnavailableException (TimeoutException right after the failures). This is because for a given range of tokens quorum will be impossible, but quorum will be possible for others.

In a vnode world, if any two nodes are down, then the intersection of the vnode token ranges they hold is unavailable.

I think it is two sides of the same coin.
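The quorum argument above can be sketched with a toy ring. This is a simplified model, not Cassandra code: the helper names are hypothetical, replica placement is assumed to be "the next RF distinct nodes clockwise" with no rack or DC awareness, and ranges are counted rather than weighted by token span.

```python
import random


def build_ring(n_nodes, tokens_per_node, seed=0):
    """Sorted list of (token, node) pairs; tokens_per_node > 1 imitates vnodes."""
    rng = random.Random(seed)
    ring = [(rng.random(), node)
            for node in range(n_nodes)
            for _ in range(tokens_per_node)]
    return sorted(ring)


def replicas(ring, i, rf):
    """Owners of range i: walk clockwise from token i, collecting rf distinct nodes."""
    owners = []
    j = i
    while len(owners) < rf:
        node = ring[j % len(ring)][1]
        if node not in owners:
            owners.append(node)
        j += 1
    return owners


def unavailable_fraction(ring, down, rf=3):
    """Fraction of token ranges where fewer than a quorum of replicas is up."""
    quorum = rf // 2 + 1
    lost = 0
    for i in range(len(ring)):
        up = [n for n in replicas(ring, i, rf) if n not in down]
        if len(up) < quorum:
            lost += 1
    return lost / len(ring)


if __name__ == "__main__":
    single = [(i / 12, i) for i in range(12)]  # one evenly spaced token per node
    vnodes = build_ring(12, 64)                # 64 random tokens per node
    print("single-token, adjacent nodes down:", unavailable_fraction(single, {0, 1}))
    print("single-token, distant nodes down: ", unavailable_fraction(single, {0, 6}))
    print("vnodes, any two nodes down:       ", unavailable_fraction(vnodes, {0, 6}))
```

In the single-token ring only pairs of nodes that sit close together take out quorum, and when they do they take out whole ranges. With many tokens per node almost any pair shares some ranges, but each shared slice is a smaller part of the ring.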

On Mon, Dec 10, 2012 at 7:41 AM, Richard Low <rlow@acunu.com> wrote:

Hi Tyler,

You're right, the math does assume independence which is unlikely to be accurate. But if you do have correlated failure modes e.g. same power, racks, DC, etc. then you can still use Cassandra's rack-aware or DC-aware features to ensure replicas are spread around so your cluster can survive the correlated failure mode. So I would expect vnodes to improve uptime in all scenarios, but haven't done the math to prove it.

Richard.