Strictly speaking, your math makes the assumption that the failures of different nodes are probabilistically independent events. This is, of course, not an accurate assumption for real-world conditions. Nodes share racks, networking equipment, power, availability zones, data centers, etc. So I think the mathematical assertion is not quite as strong as one would like, but it's certainly a good argument for handling certain types of node failures.
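To make that concrete with a toy calculation (all numbers below are purely illustrative assumptions, not measurements): even a small shared-rack failure probability can swamp the "independent failures" estimate.

```python
# Toy illustration of correlated vs. independent failures.
# All numbers are assumptions for illustration, not measurements.
p = 0.01   # assumed probability a given node fails during some window
q = 0.001  # assumed probability a rack shared by all RF replicas fails
rf = 3     # replication factor

# Under the independence assumption, all RF replicas fail together
# only by coincidence:
p_independent = p ** rf  # ~1e-06

# If all RF replicas share a rack, one rack failure already takes out
# every replica, so the correlated term dominates:
p_correlated = q + (1 - q) * p ** rf  # ~1e-03

print(p_independent)
print(p_correlated)
```

Three orders of magnitude worse, from a single shared dependency the independence model can't see.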

On Fri, Dec 7, 2012 at 11:27 AM, Nicolas Favre-Felix wrote:
Hi Eric,

Your concerns are perfectly valid.

We (Acunu) led the design and implementation of this feature and spent a long time looking at the impact of such a large change.
We summarized some of our notes and wrote about the impact of virtual nodes on cluster uptime a few months back: http://www.acunu.com/2/post/2012/10/improving-cassandras-uptime-with-virtual-nodes.html
The main argument in this blog post is that a quorum read/write can only fail if at least RF replicas fail within the time it takes to rebuild the first dead node. We show that virtual nodes actually decrease the probability of failure: by streaming data from all nodes, they reduce the rebuild time.
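If you want to play with that argument yourself, here is a rough sketch in Python. It is my own simplification, not the model from the blog post: node failures are treated as independent with an assumed rate, and the made-up rebuild times compare a "classic" rebuild that streams from a few neighbours against a vnode rebuild that streams from all remaining nodes in parallel.

```python
import math

def p_quorum_loss(n, rf, lam, t_rebuild):
    """P(at least rf-1 of the remaining n-1 nodes fail within t_rebuild),
    assuming independent exponential failures at rate lam per node.
    This is a simplification for illustration only."""
    p = 1 - math.exp(-lam * t_rebuild)  # P(one node fails during rebuild)
    m = n - 1
    # P(X >= rf-1) for X ~ Binomial(m, p)
    return sum(math.comb(m, k) * p**k * (1 - p)**(m - k)
               for k in range(rf - 1, m + 1))

n, rf = 100, 3
lam = 1 / (365 * 24)   # assumed: one failure per node-year, in 1/hours
t_classic = 10.0       # assumed: hours to rebuild from a few neighbours
# With vnodes, all n-1 nodes stream pieces in parallel, so the rebuild
# window shrinks roughly in proportion (assumed scaling):
t_vnodes = t_classic * rf / (n - 1)

print(p_quorum_loss(n, rf, lam, t_classic))
print(p_quorum_loss(n, rf, lam, t_vnodes))
```

Under these assumptions the shorter rebuild window cuts the quorum-loss probability by several orders of magnitude, which is the shape of the argument in the post.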

Regards,

Nicolas

On Wed, Dec 5, 2012 at 4:45 PM, Eric Parusel <ericparusel@gmail.com> wrote:
Hi all,

I've been wondering about virtual nodes and how cluster uptime might change as cluster size increases.

I understand clusters will benefit from increased reliability due to faster rebuild times, but does that hold true for large clusters?

It seems that since every physical node will likely share some small amount of data with every other node (and correct me if I'm wrong here), as the count of physical nodes in a Cassandra cluster increases (let's say into the triple digits), the probability of at least one failed quorum read/write occurring in a given time period would *increase*.

Would this hold true, at least until the number of physical nodes becomes greater than num_tokens per node?

I understand that the window of failure for affected ranges would probably be small, but we do quorum reads of many keys, so we'd likely hit every virtual range with our queries, even if num_tokens were 256.
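To put rough numbers on the worry (all assumptions are mine, for illustration): if vnodes make it likely that every RF-sized subset of nodes co-owns some range, then a quorum failure *somewhere* in the cluster only requires any RF nodes to be down at the same moment, and that probability grows with cluster size.

```python
import math

def p_at_least_rf_down(n, rf, p):
    """P(at least rf of n nodes are down at once), assuming independent
    failures with per-node downtime probability p (an assumed number)."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(rf, n + 1))

p = 0.001  # assumed probability a given node is down at any instant
for n in (10, 100, 1000):
    print(n, p_at_least_rf_down(n, 3, p))
```

Under these toy assumptions the probability of some RF-wide overlap being down rises steeply as the node count goes into the triple digits, which is the effect I'm asking about.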

Thanks,
Eric

--
Tyler Hobbs
DataStax