cassandra-user mailing list archives

From Markus Jais <>
Subject Re: Replication Factor question
Date Wed, 16 Apr 2014 08:47:04 GMT
Hi Rob,

Thanks. How many nodes do you have running in those 5 racks with RF 5? Only 5 nodes, or more?


Robert Coli <> wrote at 20:36 on Tuesday, 15 April 2014:

On Tue, Apr 15, 2014 at 6:14 AM, Ken Hancock <> wrote:
>Keep in mind if you lose the wrong two, you can't satisfy quorum.  In a 5-node cluster
with RF=3, it would be impossible to lose 2 nodes without affecting quorum for at least some
of your data. In a 6-node cluster, once you've lost one node, if you were to lose another,
you only have a 1-in-5 chance of not affecting quorum for some of your data.
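
Both figures can be checked by brute force. What follows is only a minimal sketch, and it assumes things the thread does not state: a single-token ring with SimpleStrategy-style placement, where each range lives on its owning node plus the next RF-1 nodes clockwise. The names replica_sets and safe_pairs are made up for illustration and are not part of any Cassandra API.

```python
from itertools import combinations


def quorum(rf):
    # QUORUM needs a strict majority of the replicas.
    return rf // 2 + 1


def replica_sets(num_nodes, rf):
    # Assumption: one range per node, replicated on that node and the
    # next rf - 1 nodes clockwise around the ring.
    return [
        frozenset((start + k) % num_nodes for k in range(rf))
        for start in range(num_nodes)
    ]


def safe_pairs(num_nodes, rf):
    # Return every pair of nodes that can be down at the same time
    # while all ranges still have a quorum of live replicas.
    needed = quorum(rf)
    ranges = replica_sets(num_nodes, rf)
    safe = []
    for down in combinations(range(num_nodes), 2):
        if all(len(replicas - set(down)) >= needed for replicas in ranges):
            safe.append(down)
    return safe


if __name__ == "__main__":
    for n in (5, 6):
        print(n, "nodes, RF=3, safe pairs:", safe_pairs(n, 3))
```

Under that placement assumption, the 5-node ring has no safe pair at all, and in the 6-node ring each node has exactly one partner (the node directly opposite) out of the 5 others that can fail alongside it, which matches the 1-in-5 figure.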
>This is why the real highly available way to run Cassandra with QUORUM is RF=5, with 5
>Briefly, any given node running a JVM-based distributed application should be assumed
to potentially become transiently unavailable for a short time, for example during long GC
pauses or rolling restarts. There is also a chance of non-transient failure (hard down) at
any time, and a much smaller chance of two simultaneous non-transient failures. If you have
RF=3 and lose two nodes (one transient, the other non-transient) in a range, that range is
now unavailable because quorum is 2 and 3-2 is 1, which is less than 2. If you have RF=5 and
lose two nodes in the same way, quorum is 3 and 5-2 is 3, which is equal to 3.
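
The arithmetic above can be written out as a small check. This is only a sketch: the helper names quorum and range_available are invented for illustration, and it looks at a single range with a given number of its replicas down.

```python
def quorum(rf):
    # QUORUM is a strict majority of the replication factor.
    return rf // 2 + 1


def range_available(rf, replicas_down):
    # A range can serve QUORUM requests only while the replicas
    # still up are at least as many as the quorum size.
    return rf - replicas_down >= quorum(rf)


if __name__ == "__main__":
    for rf in (3, 5):
        print(
            "RF=%d quorum=%d available with 2 down: %s"
            % (rf, quorum(rf), range_available(rf, 2))
        )
```

With RF=3 the range is unavailable once two replicas are down (1 < 2); with RF=5 it survives the same loss (3 >= 3).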
>AFAICT, no one actually runs Cassandra in this way because keeping 5 copies of your already
denormalized data seems excessive and is difficult to justify to management.