cassandra-user mailing list archives

From Alexander Dejanovski <>
Subject Re: vnodes: high availability
Date Mon, 15 Jan 2018 16:55:40 GMT
Hi Kyrylo,

the situation is a bit more nuanced than shown by the Datastax diagram,
which is fairly theoretical.
If you're using SimpleStrategy, there is no rack awareness. Since vnode
distribution is purely random, and the replica for a vnode will be placed
on the node that owns the next vnode in token order (yeah, that's not easy
to formulate), you end up with statistics only.
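To make the placement rule above concrete, here is a minimal simulation sketch. It is not Cassandra's actual code: tokens are plain random floats rather than Murmur3 tokens, and the helper names (build_ring, replica_sets, pairs_sharing_a_range) are made up for illustration. It assumes the cluster has at least RF nodes.

```python
import random
from itertools import combinations

def build_ring(num_nodes, vnodes_per_node, rng):
    """Return owning node ids ordered by random token, one entry per vnode."""
    tokens = [(rng.random(), node)
              for node in range(num_nodes)
              for _ in range(vnodes_per_node)]
    tokens.sort()
    return [node for _, node in tokens]

def replica_sets(ring, rf):
    """For each vnode, walk the ring clockwise and keep the first rf distinct nodes."""
    n = len(ring)
    sets = []
    for i in range(n):
        replicas = []
        j = i
        while len(replicas) < rf:  # assumes at least rf distinct nodes exist
            node = ring[j % n]
            if node not in replicas:
                replicas.append(node)
            j += 1
        sets.append(frozenset(replicas))
    return sets

def pairs_sharing_a_range(sets):
    """Collect every pair of nodes that are replicas for the same range."""
    pairs = set()
    for s in sets:
        pairs.update(combinations(sorted(s), 2))
    return pairs

# Parameters taken from the thread: 50 nodes, 256 vnodes each, RF=3.
rng = random.Random(42)
ring = build_ring(50, 256, rng)
shared = pairs_sharing_a_range(replica_sets(ring, 3))
total_pairs = 50 * 49 // 2
print(f"{len(shared)} of {total_pairs} node pairs share at least one range")
```

Any pair counted here is a pair whose simultaneous loss would leave some range with a single live replica, that is, below QUORUM at RF=3.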

I kinda suck at maths but I'm going to risk making a fool of myself :)

The odds for one vnode to be replicated on a given other node are, in your
case, roughly 2/49 (out of 49 remaining nodes, 2 replicas need to be placed).
Given you have 256 vnodes, 256*(2/49) = 10.4 is the expected number of
vnodes of a single node that have a replica on a given other node. That is
a count, not a percentage. Treating the vnodes as independent, the odds for
at least one vnode of a single node to exist on another one are
1 - (47/49)^256, which is above 99.99%.
The relationship is bi-directional (there are the same odds for node
B to have a vnode replicated on node A as the opposite), which only pushes
the odds of 2 nodes both being replicas for at least one vnode closer to 100%.

Having a smaller number of vnodes will decrease the odds, as will having
more nodes in the cluster.
(now once again, I hope my maths aren't fully wrong, I'm pretty rusty in
that area...)
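As a quick check of the arithmetic, the sketch below takes the 2/49 per-vnode figure at face value and assumes, as a simplification, that the 256 vnode placements are independent. The product 256*(2/49) is then an expected count, while the chance of "at least one" comes from the complement of "none":

```python
# Per-vnode chance that a given other node holds a replica (figure from this message).
p_single = 2 / 49
vnodes = 256

# Expected number of vnodes of node A replicated on node B: a count, not a probability.
expected_shared = vnodes * p_single

# Probability that at least one of the vnodes has a replica on node B,
# assuming (simplification) that vnode placements are independent.
p_at_least_one = 1 - (1 - p_single) ** vnodes

print(f"expected shared vnodes: {expected_shared:.2f}")
print(f"P(at least one shared): {p_at_least_one:.6f}")
```

Under these assumptions the expected count is about 10.4 and the probability is above 99.99%, which is consistent with the nodetool ring result quoted below, where a node turned out to neighbour every other node.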

How many queries that will affect is a different question, as it depends on
which partitions currently exist and are queried in the unavailable token
ranges.
Then you have rack awareness, which comes with NetworkTopologyStrategy:
If the number of replicas (3 in your case) is proportional to the number of
racks, Cassandra will spread the replicas across different racks.
In that situation, you can theoretically lose as many nodes as you want in
a single rack, you will still have two other replicas available to satisfy
quorum in the remaining racks.
If you start losing nodes in different racks, we're back to doing maths
(but the odds will get slightly different).

That makes maintenance predictable because you can shut down as many nodes
as you want in a single rack without losing QUORUM.
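The rack argument can be illustrated with a small sketch. This is a simplified model, not NetworkTopologyStrategy's real implementation: it assumes a hypothetical cluster of 51 nodes spread evenly over 3 racks, random float tokens, and a placement rule that only accepts a node if its rack has no replica for that range yet.

```python
import random

NUM_NODES, VNODES, RF, NUM_RACKS = 51, 16, 3, 3  # hypothetical cluster shape
rng = random.Random(7)

# Hypothetical rack assignment: node i lives in rack i % NUM_RACKS.
rack_of = {node: node % NUM_RACKS for node in range(NUM_NODES)}

# Ring of vnodes ordered by random token; each entry is the owning node id.
ring = [node for _, node in sorted(
    (rng.random(), node) for node in range(NUM_NODES) for _ in range(VNODES))]

def rack_aware_replicas(ring, rack_of, rf):
    """Walk clockwise from each vnode, accepting a node only if its rack is unused."""
    n = len(ring)
    placements = []
    for i in range(n):
        replicas, used_racks = [], set()
        j = i
        while len(replicas) < rf and j < i + n:
            node = ring[j % n]
            if rack_of[node] not in used_racks:
                replicas.append(node)
                used_racks.add(rack_of[node])
            j += 1
        placements.append(replicas)
    return placements

placements = rack_aware_replicas(ring, rack_of, RF)

# Take every node of rack 0 down and count the live replicas left per range.
down = {node for node in range(NUM_NODES) if rack_of[node] == 0}
min_live = min(sum(1 for r in replicas if r not in down) for replicas in placements)
print(f"fewest live replicas for any range with rack 0 down: {min_live}")
```

With one replica per rack, losing a whole rack removes exactly one replica of every range, so two remain and QUORUM (2 of 3) is still reachable.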

Feel free to correct my numbers if I'm wrong.


On Mon, Jan 15, 2018 at 5:27 PM Kyrylo Lebediev <> wrote:

> Thanks, Rahul.
> But in your example, at the same time loss of Node3 and Node6 leads to
> loss of ranges N, C, J at consistency level QUORUM.
> As far as I understand in case vnodes > N_nodes_in_cluster and
> endpoint_snitch=SimpleSnitch, since:
> 1) "secondary" replicas are placed on two nodes 'next' to the node
> responsible for a range (in case of RF=3)
> 2) there are a lot of vnodes on each node
> 3) ranges are evenly distributed between vnodes in case of SimpleSnitch,
> we get all physical nodes (servers) having mutually adjacent token ranges.
> Is it correct?
> At least in the case of my real-world ~50-node cluster with vnodes=256, RF=3,
> this command:
> nodetool ring | grep '^<ip-prefix>' | awk '{print $1}' | uniq | grep -B2
> -A2 '<ip_of_a_node>' | grep -v '<ip_of_a_node>' | grep -v '^--' | sort |
> uniq | wc -l
> returned a number equal to Nnodes - 1, which means that I can't switch
> off 2 nodes at the same time without losing some key range at CL=QUORUM.
> Thanks,
> Kyrill
> ------------------------------
> *From:* Rahul Neelakantan <>
> *Sent:* Monday, January 15, 2018 5:20:20 PM
> *To:*
> *Subject:* Re: vnodes: high availability
> Not necessarily. It depends on how the token ranges for the vNodes are
> assigned to them. For example take a look at this diagram
> In the vNode part of the diagram, you will see that Loss of Node 3 and
> Node 6, will still not have any effect on Token Range A. But yes if you
> lose two nodes that both have Token Range A assigned to them (Say Node 1
> and Node 2), you will have unavailability with your specified configuration.
> You can sort of circumvent this by using the DataStax Java Driver and
> having the client recognize a degraded cluster and operate temporarily in
> downgraded consistency mode
> - Rahul
> On Mon, Jan 15, 2018 at 10:04 AM, Kyrylo Lebediev <> wrote:
> Hi,
> Let's say we have a C* cluster with following parameters:
>  - 50 nodes in the cluster
>  - RF=3
>  - vnodes=256 per node
>  - CL for some queries = QUORUM
>  - endpoint_snitch = SimpleSnitch
> Is it correct that any 2 nodes being down will cause unavailability of a
> key range at CL=QUORUM?
> Regards,
> Kyrill

Alexander Dejanovski

Apache Cassandra Consulting
