incubator-cassandra-user mailing list archives

From Robert Coli <rc...@eventbrite.com>
Subject Re: Replication Factor question
Date Mon, 14 Apr 2014 18:30:09 GMT
On Mon, Apr 14, 2014 at 2:25 AM, Markus Jais <markus.jais@yahoo.de> wrote:

> "It is generally not recommended to set a replication factor of 3 if you
> have fewer than six nodes in a data center".
>

I have a detailed post about this somewhere in the archives of this list
(which I can't seem to find right now), but briefly, the "6-for-3" advice
relates to the percentage of capacity you have remaining when you have a
node down. It has become slightly less accurate over time because vnodes
reduce bootstrap time and there have been other improvements to node
startup time.

If you have fewer than 6 nodes with RF=3, you lose more than 1/6th of
capacity when you lose a single node, which is a significant percentage of
total cluster capacity. You then lose another meaningful percentage of your
capacity while your remaining nodes participate in rebuilding the missing
node. If you are then unlucky enough to lose another node, you are missing
a very significant percentage of your cluster capacity and have to use a
relatively small remaining fraction of it to rebuild the two nodes that are
now down.
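
To make the arithmetic above concrete, here is a rough sketch in Python. It
is not anything Cassandra provides: the function names are made up for
illustration, and it assumes load is spread evenly across identical nodes,
so each node counts for 1/N of cluster capacity. It also ignores the extra
capacity consumed by rebuild streaming, which only makes things worse.

```python
from fractions import Fraction


def capacity_lost(total_nodes: int, nodes_down: int) -> Fraction:
    """Fraction of cluster capacity lost with `nodes_down` nodes offline.

    Assumption (not from Cassandra itself): every node carries an equal
    share of the load, so each node is worth 1/total_nodes of capacity.
    """
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    if not 0 <= nodes_down <= total_nodes:
        raise ValueError("nodes_down must be between 0 and total_nodes")
    return Fraction(nodes_down, total_nodes)


def capacity_remaining(total_nodes: int, nodes_down: int) -> Fraction:
    """Fraction of capacity still serving traffic (before rebuild overhead)."""
    return 1 - capacity_lost(total_nodes, nodes_down)


if __name__ == "__main__":
    # Compare small clusters against the 6-node starting point.
    for n in (3, 4, 5, 6, 9):
        one = capacity_lost(n, 1)
        two = capacity_lost(n, 2)
        print(f"{n} nodes: lose {one} with one node down, {two} with two down")
```

With these assumptions, 6 nodes is the smallest cluster where a single node
failure costs no more than 1/6th of capacity. Any smaller cluster loses
more than that per node.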

I wouldn't generalize the rule of thumb as "don't run under N=RF*2", but
rather as "probably don't run RF=3 under about 6 nodes". IOW, in my view,
the most operationally sane initial number of nodes for RF=3 is likely
closer to 6 than 3.

=Rob
