cassandra-user mailing list archives

From Eric Parusel <>
Subject Re: Virtual Nodes, lots of physical nodes and potentially increasing outage count?
Date Tue, 11 Dec 2012 08:39:02 GMT
Thanks for your thoughts guys.

I agree that with vnodes total downtime is lessened, although it also
seems that the total number of outages (however small) would be greater.

But I think downtime is only lessened up to a certain cluster size.

I'm thinking that as the cluster continues to grow:
  - node rebuild time will max out (a single node only has so much write
bandwidth)
  - the probability of 2 nodes being down at any given time will continue
to increase -- even if you consider only non-correlated failures.

Therefore, when adding nodes beyond the point where node rebuild time maxes
out, both the total number of outages *and* overall downtime would increase?
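
To put rough numbers on the second point, here is a quick sketch. It assumes
every node is independently down with the same probability at any instant;
the function name and the p_down value of 0.001 are made up for illustration,
not measured from a real cluster.

```python
from math import comb

def prob_at_least_k_down(n_nodes, p_down, k=2):
    """P(at least k of n_nodes are down at the same instant).

    Assumes independent, identically distributed node failures
    (binomial model), i.e. no correlated failures at all.
    """
    # Probability that fewer than k nodes are down.
    p_fewer = sum(
        comb(n_nodes, i) * p_down**i * (1 - p_down) ** (n_nodes - i)
        for i in range(k)
    )
    return 1.0 - p_fewer

if __name__ == "__main__":
    # p_down = 0.001 is an illustrative assumption.
    for n in (10, 50, 100, 500, 1000):
        print(n, prob_at_least_k_down(n, 0.001))
```

Under this model the chance of two or more nodes being down together keeps
climbing with cluster size, while the per-node rebuild time has a floor, which
is the combination I'm worried about.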


On Mon, Dec 10, 2012 at 7:00 AM, Edward Capriolo <> wrote:

> Assume you need to work with quorum in a non-vnode scenario. That means
> that if 2 nodes in a row in the ring are down, some number of quorum
> operations will fail with UnavailableException (TimeoutException right
> after the failures). This is because for a given range of tokens quorum
> will be impossible, but quorum will be possible for others.
> In a vnode world, if any two nodes are down, then the intersection of
> the vnode token ranges they have is unavailable.
> I think it is two sides of the same coin.
> On Mon, Dec 10, 2012 at 7:41 AM, Richard Low <> wrote:
>> Hi Tyler,
>> You're right, the math does assume independence which is unlikely to be
>> accurate.  But if you do have correlated failure modes e.g. same power,
>> racks, DC, etc. then you can still use Cassandra's rack-aware or DC-aware
>> features to ensure replicas are spread around so your cluster can survive
>> the correlated failure mode.  So I would expect vnodes to improve uptime in
>> all scenarios, but haven't done the math to prove it.
>> Richard.
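
Edward's comparison in the quoted text can be sketched with a toy ring. This
is a simplified model, not Cassandra's actual placement code: replicas for a
range are the next rf distinct nodes clockwise (roughly what a simple,
non-rack-aware strategy does), ranges are counted rather than weighted by
size, and all names (unavailable_fraction, single_token_ring, vnode_ring) plus
the 12-node / 256-token figures are assumptions for illustration.

```python
import random

def unavailable_fraction(ring, down, rf=3):
    """Fraction of token ranges that cannot reach quorum.

    ring: list of (token, node) pairs; down: set of down nodes.
    Assumes the ring contains at least rf distinct nodes.
    """
    ring = sorted(ring)
    n = len(ring)
    quorum = rf // 2 + 1
    lost = 0
    for i in range(n):
        # Walk clockwise from this token, collecting rf distinct nodes.
        replicas = []
        for step in range(n):
            node = ring[(i + step) % n][1]
            if node not in replicas:
                replicas.append(node)
            if len(replicas) == rf:
                break
        alive = sum(1 for r in replicas if r not in down)
        if alive < quorum:
            lost += 1
    return lost / n

def single_token_ring(n_nodes):
    # One token per node, in node order (the non-vnode case).
    return [(i, i) for i in range(n_nodes)]

def vnode_ring(n_nodes, tokens_per_node, seed=42):
    # Random tokens per node (the vnode case); seeded for repeatability.
    rng = random.Random(seed)
    return [(rng.random(), node)
            for node in range(n_nodes)
            for _ in range(tokens_per_node)]

if __name__ == "__main__":
    plain = single_token_ring(12)
    vn = vnode_ring(12, 256)
    print("non-vnode, adjacent nodes down:", unavailable_fraction(plain, {0, 1}))
    print("non-vnode, distant nodes down: ", unavailable_fraction(plain, {0, 6}))
    print("vnode, any two nodes down:     ", unavailable_fraction(vn, {0, 6}))
```

In this toy model, two distant nodes going down costs nothing without vnodes,
while two adjacent ones take out a couple of whole ranges. With vnodes, nearly
any pair of down nodes shares some small ranges, so more pairs cause an outage
but each outage covers less of the ring.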
