cassandra-user mailing list archives

From kurt greaves
Subject Re: Reply: tolerate how many nodes down in the cluster
Date Mon, 24 Jul 2017 22:27:12 GMT
I've never really understood why DataStax recommends against racks. In
those docs they make it out to be much more difficult than it actually is
to configure and manage racks.

The important thing to keep in mind when using racks is that your # of
racks should be equal to your RF. If you have keyspaces with different RF,
then it's best to have the same # as the RF of your most important
keyspace, but in this scenario you lose some of the benefits of using racks.
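To illustrate why # of racks == RF works out so neatly, here is a rough sketch of rack-aware replica placement. It is a simplified model of what NetworkTopologyStrategy does, not Cassandra's actual code: it assumes one token per node (no vnodes) and a single DC, and the node names, rack names and `place_replicas` helper are made up for the example.

```python
def place_replicas(ring, rf):
    """Simplified rack-aware placement (illustrative only).

    ring: list of (token, node, rack) tuples sorted by token, one token per node.
    Returns {token: [(node, rack), ...]} with rf replicas per range.
    """
    placement = {}
    n = len(ring)
    for i, (token, _, _) in enumerate(ring):
        replicas, racks_used, skipped = [], set(), []
        # Walk the ring clockwise, preferring nodes in racks not used yet.
        for step in range(n):
            _, node, rack = ring[(i + step) % n]
            if rack not in racks_used:
                replicas.append((node, rack))
                racks_used.add(rack)
            else:
                skipped.append((node, rack))
            if len(replicas) == rf:
                break
        # Fewer racks than rf: fall back to nodes that were skipped earlier.
        for candidate in skipped:
            if len(replicas) == rf:
                break
            replicas.append(candidate)
        placement[token] = replicas
    return placement


# Hypothetical 6-node cluster spread over 3 racks, in uneven ring order.
ring = [
    (0, "n1", "r1"), (10, "n2", "r1"), (20, "n3", "r2"),
    (30, "n4", "r3"), (40, "n5", "r2"), (50, "n6", "r3"),
]

placement_rf3 = place_replicas(ring, rf=3)
placement_rf2 = place_replicas(ring, rf=2)

for token, replicas in placement_rf3.items():
    print(token, replicas)
```

With RF=3 and 3 racks, every range ends up with exactly one replica in each rack, so each rack holds a full copy of the data. With RF=2 on the same 3 racks, each range is missing from one rack, which is the "lose some of the benefits" case.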

As Anuj has described, if you use RF # of racks, you *can* lose up to an
entire rack without losing availability. Note that this entirely depends on
the situation. *When you take a node down, the other nodes in the cluster
require capacity to be able to handle the extra load that node is no longer
handling.* What this means is that your cluster will require the other
nodes to store hints for that node (equivalent to the amount of writes made
to that node), and also handle its portion of reads. You can only take out
as many nodes from a rack as the capacity of your cluster allows.
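As a back-of-the-envelope version of the capacity point above, something like the following can give a first guess at how many nodes you can afford to take down. It is only a sketch under loud assumptions: load is treated as a single utilisation figure, it is assumed to spread evenly across the surviving nodes, and the extra cost of storing hints is not modelled. The function name and the example numbers are made up.

```python
def max_nodes_down(total_nodes, utilisation, ceiling=1.0):
    """Largest number of nodes that can be down while the survivors stay at
    or below `ceiling` utilisation.

    Assumes (hypothetically) that the cluster's total load is redistributed
    evenly across the remaining nodes. Hint storage is ignored.
    """
    total_load = utilisation * total_nodes
    down = 0
    # Keep taking one more node down while the survivors can absorb the load.
    while down + 1 < total_nodes and total_load / (total_nodes - (down + 1)) <= ceiling:
        down += 1
    return down


# 12 nodes in 3 racks of 4, each node at 50% of capacity, 80% ceiling.
print(max_nodes_down(12, 0.5, ceiling=0.8))  # 4: a whole rack can go down
# Same cluster running at 70%: only one node can go down safely.
print(max_nodes_down(12, 0.7, ceiling=0.8))  # 1
```

So the rack layout tells you how many nodes you *may* lose without losing availability, while the headroom tells you how many you can actually afford to lose.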

I also strongly disagree that using racks makes operations tougher. If
anything, it makes them considerably easier (especially when using vnodes).
The only difficulty is the initial setup of racks, but for all the possible
benefits it's certainly worth it. As well as the fact that you can lose up
to an entire rack (great for AWS AZs) without affecting availability,
using racks also makes operations on large clusters much smoother. For
example, when upgrading a cluster, you can now do it a rack at a time, or
some portion of a rack at a time. Same for OS upgrades or any other
operation that could happen in your environment. This is important if you
have lots of nodes. It also makes coordinating repairs easier, as (with RF
# of racks) you only need to repair a single rack to ensure you've repaired
all the data. Basically, for any operation or problem where you need to
consider the distribution of data, racks are going to help you.
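The "a rack at a time, or some portion of a rack at a time" approach can be sketched as a small planning helper. This is purely illustrative: `rolling_batches` and the node/rack names are hypothetical, and the actual upgrade or restart step is left out.

```python
def rolling_batches(nodes_by_rack, batch_size=None):
    """Yield (rack, [nodes]) batches that never span more than one rack.

    nodes_by_rack: {rack_name: [node, ...]}
    batch_size: nodes per batch; None means the whole rack at once.
    """
    for rack in sorted(nodes_by_rack):
        nodes = nodes_by_rack[rack]
        # Guard against a zero step when a rack has no nodes.
        size = batch_size or len(nodes) or 1
        for start in range(0, len(nodes), size):
            yield rack, nodes[start:start + size]


# Hypothetical cluster: 3 racks of 4 nodes each.
nodes_by_rack = {
    "rack1": ["n1", "n2", "n3", "n4"],
    "rack2": ["n5", "n6", "n7", "n8"],
    "rack3": ["n9", "n10", "n11", "n12"],
}

half_rack_plan = list(rolling_batches(nodes_by_rack, batch_size=2))
full_rack_plan = list(rolling_batches(nodes_by_rack))

for rack, batch in half_rack_plan:
    print(rack, batch)
```

Because every batch stays inside one rack, at most one replica of any range is affected at a time (assuming RF # of racks), which is what makes this kind of rollout safe to reason about.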
