incubator-cassandra-user mailing list archives

From Robert Coli <rc...@eventbrite.com>
Subject Re: replication_factor: ?
Date Fri, 14 Mar 2014 23:18:43 GMT
On Fri, Mar 7, 2014 at 2:01 PM, Donald Smith <
Donald.Smith@audiencescience.com> wrote:

>  Robert, please elaborate why you say "To make best use of Cassandra, my
> minimum recommendation is usually RF=3, N=6."
>
>
> I surmise that with any less than 6 nodes, you'd likely perform better
> with a sequential/single-node solution.  You need at least six nodes to
> overcome the overheads from concurrency.  But that's a vague explanation.
>

Briefly:

1) With an RF of less than 3, you are unable to meaningfully use the QUORUM
ConsistencyLevel.

2) With an RF of less than 3, edge cases with the potential for data loss are
significantly more likely.

3) With an RF of less than 3, losing a single node means losing at least 50%
of the capacity of your cluster for that range.

4) With an RF of less than 3, two replica nodes happening to pause for Java
GC at the same time means a range is unavailable.

5) With an N of less than 6, losing a single node means losing a significant
percentage of total cluster capacity. Still-live nodes share the read and
write load of the lost node, as well as sharing the overhead of creating
its replacement.
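
To make points 1, 3 and 4 concrete, here is a minimal sketch of the
arithmetic. It uses the standard quorum size of floor(RF / 2) + 1; the
function names are made up for illustration and are not part of any
Cassandra API.

```python
def quorum(rf: int) -> int:
    """Replicas that must respond for QUORUM: floor(RF / 2) + 1."""
    return rf // 2 + 1


def tolerated_failures(rf: int) -> int:
    """Replicas of a range that can be down while QUORUM still succeeds."""
    return rf - quorum(rf)


def range_capacity_lost(rf: int) -> float:
    """Fraction of a range's replicas lost when one replica node is down."""
    return 1 / rf


for rf in (1, 2, 3):
    print(
        f"RF={rf}: quorum={quorum(rf)}, "
        f"may be down={tolerated_failures(rf)}, "
        f"range capacity lost per node={range_capacity_lost(rf):.0%}"
    )
```

At RF=2 the quorum is 2, so QUORUM cannot survive a single down or paused
replica; at RF=3 the quorum is still 2, which leaves room for one.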

The "real" bare minimum to use QUORUM in production is probably RF=3,
N=4 or 5. But if you are provisioning correctly, such that your nodes have
some but not excessive headroom, an N of less than 6 makes losing and
replacing a node relatively expensive from a total cluster capacity
perspective.
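
As a rough illustration of the capacity argument, the sketch below assumes
load is spread evenly across N nodes and that the survivors absorb a lost
node's share equally. That is a simplification (it ignores the extra cost
of streaming data to the replacement), and the function names are
hypothetical.

```python
def capacity_lost(n: int) -> float:
    """Fraction of total cluster capacity lost when one of n nodes fails."""
    if n < 1:
        raise ValueError("cluster needs at least one node")
    return 1 / n


def extra_load_per_survivor(n: int) -> float:
    """Fractional load increase on each surviving node after one of n fails,
    assuming even load and even redistribution (a simplifying assumption)."""
    if n < 2:
        raise ValueError("need at least two nodes for any survivors")
    return 1 / (n - 1)


for n in (3, 4, 5, 6):
    print(
        f"N={n}: capacity lost={capacity_lost(n):.0%}, "
        f"extra load per survivor={extra_load_per_survivor(n):.0%}"
    )
```

Under these assumptions each survivor takes on 50% more load at N=3 but
only 20% more at N=6, which is why modest headroom is enough at six nodes
and not at three.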

=Rob
