incubator-cassandra-user mailing list archives

From Maki Watanabe <>
Subject Re: Replicating to all nodes
Date Thu, 14 Jul 2011 04:07:18 GMT
Consistency and availability trade off against each other.
If you use RF=7 with CL=ONE, your reads and writes will succeed as long
as one node is alive, while the data is being replicated to the 7 nodes.
Of course, you may read old data in this case.
If you need strong consistency, you must use CL=QUORUM for both reads
and writes.
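
To make the trade-off concrete, here is a rough sketch of the arithmetic
only. It does not call Cassandra; the function names are hypothetical and
it assumes a single data center with the three levels discussed in this
thread (ONE, QUORUM, ALL). A quorum is a majority of the replicas, and a
read is guaranteed to see the latest write when the read and write replica
counts together exceed RF.

```python
def required_replicas(rf: int, cl: str) -> int:
    """Replicas that must respond for an operation at consistency level cl."""
    if cl == "ONE":
        return 1
    if cl == "QUORUM":
        return rf // 2 + 1  # a majority of the replica set
    if cl == "ALL":
        return rf
    raise ValueError(f"unsupported consistency level: {cl}")


def tolerated_failures(rf: int, cl: str) -> int:
    """Replicas that can be down while the operation still succeeds."""
    return rf - required_replicas(rf, cl)


def is_strongly_consistent(rf: int, read_cl: str, write_cl: str) -> bool:
    """True when every read set overlaps every write set (R + W > RF)."""
    return required_replicas(rf, read_cl) + required_replicas(rf, write_cl) > rf


if __name__ == "__main__":
    for cl in ("ONE", "QUORUM", "ALL"):
        print(f"RF=7 CL={cl}: needs {required_replicas(7, cl)} replicas, "
              f"tolerates {tolerated_failures(7, cl)} down")
    print("ONE/ONE strong?", is_strongly_consistent(7, "ONE", "ONE"))
    print("QUORUM/QUORUM strong?", is_strongly_consistent(7, "QUORUM", "QUORUM"))
```

With RF=7, CL=ONE survives 6 replicas being down but gives no overlap
guarantee, while CL=QUORUM needs 4 replicas and survives only 3 down.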


2011/7/14 Kyle Gibson <>:
> Thanks for the reply Peter.
> The goal is to configure a cluster in which reads and writes can
> complete successfully even if only 1 node is online. For this to work,
> each node would need the entire dataset. Your example of a 3 node ring
> with RF=3 would satisfy this requirement. However, if two nodes are
> offline, CL.QUORUM would not work; I would need to use CL.ONE. If all
> 3 nodes are online, CL.ONE is undershooting; I would want to use
> CL.QUORUM (or maybe CL.ALL). Or does CL.ONE actually function this
> way, somewhat?
> A complication occurs when you want to add another node. Now there's a
> 4 node ring, but only 3 replicas, so each node isn't guaranteed to
> have all of the data, so the cluster can't completely function when
> N-1 nodes are offline. So this is why I would like to have the RF
> scale relative to the size of the cluster. Am I mistaken?
> Thanks!
> On Wed, Jul 13, 2011 at 6:41 PM, Peter Schuller
> <> wrote:
>>> Read and write operations should succeed even if only 1 node is online.
>>> When a read is performed, it is performed against all active nodes.
>> Using QUORUM is the closest thing you get for reads without modifying
>> Cassandra. You can't make it wait for all nodes that happen to be up.
>>> When a write is performed, it is performed against all active nodes,
>>> inactive/offline nodes are updated when they come back online.
>> Writes always go to all nodes that are up, but if you want to wait for
>> them before returning "OK" to the client, then no - except CL.ALL
>> (which means you don't survive one being down) and CL.QUORUM (which
>> means you don't wait for all if all are up).
>>> I don't believe it does. Currently the replication factor is hard
>>> coded based on key space, not a function of the number of nodes in the
>>> cluster. You could say, if N = 7, configure replication factor = 7,
>>> but then if only 6 nodes are online, writes would fail. Is this
>>> correct?
>> No. Reads/writes fail according to the consistency level. The RF +
>> consistency level tells how many nodes must be up and successfully
>> service the request in order for the operation to succeed. RF just
>> tells you the total number of nodes in the replica set for a key;
>> whether an operation fails is up to the consistency level.
>> I would ask: Why are you trying to do this? It really seems you're
>> trying to do the "wrong" thing. Why would you ever want to replicate
>> to all? If you want 3 copies in total, then do RF=3 and keep a 3 node
>> ring. If you need more capacity, you add nodes and retain RF. If you
>> need more redundancy, you have to increase RF. Those are two very
>> different axes along which to scale. I cannot think of any reason why
>> you would want to tie RF to the total number of nodes.
>> What is the goal you're trying to achieve?
>> --
>> / Peter Schuller (@scode on twitter)

