incubator-cassandra-user mailing list archives

From aaron morton <>
Subject Re: Active / Active Data Center and RF
Date Mon, 21 Mar 2011 08:19:30 GMT
I'll take another crack at it, here's how I think it works.

When using the NetworkTopologyStrategy you can specify how the RF is distributed between the
DCs you have. This is done as part of the schema definition. In a CLI script, use
the strategy_options clause of the create keyspace statement. It is also available via the
yaml configuration and the RPC.

For example, you can split an RF of 6 evenly over two DCs, or put 4 replicas in one and 2 in
the other. You can slice it up any way you want.
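
To make the arithmetic concrete, here is a rough Python sketch of the quorum sizes that fall
out of a 4 and 2 split. The DC names and the quorum() helper are made up for illustration;
this is not Cassandra code, just the usual "more than half" calculation.

```python
def quorum(rf):
    """Smallest majority of rf replicas."""
    return rf // 2 + 1

# Hypothetical DC names; the 4/2 split is the example from the text.
replicas_per_dc = {"DC1": 4, "DC2": 2}

# The full cluster RF is the sum of the per-DC replica counts.
total_rf = sum(replicas_per_dc.values())

# QUORUM is worked out from the full RF, LOCAL_QUORUM from one DC's share.
cluster_quorum = quorum(total_rf)
local_quorums = {dc: quorum(rf) for dc, rf in replicas_per_dc.items()}

print(total_rf, cluster_quorum, local_quorums)
```

With this split a QUORUM request needs 4 replicas, more than a local quorum in DC1 (3 of its
4 replicas), so one replica from DC2 may have to answer as well.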

When using the NetworkTopologyStrategy you will want to use the PropertyFileSnitch (set in
yaml config). This reads the conf/ file to find out which DC
and which rack a node is in. The old way is the RackInferringSnitch. 

So no matter which DC the coordinator is in, the cluster will try to place replicas according
to these configuration settings. The LOCAL_QUORUM and EACH_QUORUM CLs tell the coordinator
to block on either just the local mutations, or on the local mutations and the remote ones in each DC.
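
As a sketch of the difference on the write path, the check the coordinator makes could be
modelled like this in Python. The function and argument names are hypothetical, and I am
assuming EACH_QUORUM means a quorum of acks from every DC; the real logic lives in the
Cassandra write response handlers.

```python
def quorum(rf):
    """Smallest majority of rf replicas."""
    return rf // 2 + 1

def write_satisfied(cl, acks_per_dc, replicas_per_dc, local_dc):
    """Return True once enough replicas have acknowledged the mutation.

    acks_per_dc and replicas_per_dc map a DC name to a count.
    """
    if cl == "LOCAL_QUORUM":
        # Only acks from the coordinator's own DC count.
        return acks_per_dc.get(local_dc, 0) >= quorum(replicas_per_dc[local_dc])
    if cl == "EACH_QUORUM":
        # Assumption: every DC must reach its own quorum.
        return all(acks_per_dc.get(dc, 0) >= quorum(rf)
                   for dc, rf in replicas_per_dc.items())
    raise ValueError("unsupported consistency level: " + cl)

replicas = {"DC1": 4, "DC2": 2}   # hypothetical DC names
acks = {"DC1": 3, "DC2": 0}       # only local replicas have answered so far
print(write_satisfied("LOCAL_QUORUM", acks, replicas, "DC1"))
print(write_satisfied("EACH_QUORUM", acks, replicas, "DC1"))
```

Either way the mutation is still sent to all replicas; the CL only changes how many acks the
coordinator waits for.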

The same settings are used when a read is performed to determine where the replicas are. In
the read case the PropertyFileSnitch will sort the (live) endpoints by proximity to the coordinator
node, taking into account both the rack and the datacentre. The request is only sent
to the number of nodes we are going to block on, with the closest nodes chosen first. If you
read at LOCAL_QUORUM and everything is working, your read will only use nodes in the local
DC, using the DC's local RF. If you read at QUORUM you would use the full cluster's RF and
the read would potentially cross DCs.
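
The sort-then-trim step can be sketched in Python like this. The node names, the dict layout
and the three proximity levels are my own simplification for illustration, not the snitch's
actual implementation.

```python
def proximity(coordinator, endpoint):
    """0 = same rack, 1 = same DC but another rack, 2 = another DC."""
    if endpoint["dc"] != coordinator["dc"]:
        return 2
    if endpoint["rack"] != coordinator["rack"]:
        return 1
    return 0

def pick_read_targets(coordinator, live_endpoints, block_for):
    """Sort live replicas by proximity and keep only as many as we block on."""
    ranked = sorted(live_endpoints, key=lambda e: proximity(coordinator, e))
    return ranked[:block_for]

# Hypothetical topology: coordinator in DC1/rack1, four live replicas.
coordinator = {"name": "n1", "dc": "DC1", "rack": "r1"}
live = [
    {"name": "n5", "dc": "DC2", "rack": "r1"},
    {"name": "n3", "dc": "DC1", "rack": "r2"},
    {"name": "n2", "dc": "DC1", "rack": "r1"},
    {"name": "n6", "dc": "DC2", "rack": "r2"},
]
targets = pick_read_targets(coordinator, live, block_for=2)
print([t["name"] for t in targets])
```

Here a read that blocks on 2 nodes stays inside DC1. If n3 were down, the second slot would
go to a DC2 node, which is how a QUORUM read can end up crossing DCs.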

For both LOCAL_QUORUM and EACH_QUORUM the read blocks until a quorum of nodes, calculated from
the local DC's RF, has returned. (The remote DC RF settings are ignored; does anyone know why?)

Hope that helps.

On 21 Mar 2011, at 16:43, mcasandra wrote:

> CL is just a way to satisfy consistency but you still want the majority of your
> reads (preferably) occurring in the same DC.
> I don't think that answers my question at all. I understand the CL but I
> think I have a more basic and important question about active/active data
> centers and the replicas in that very specific scenario, which to me looks
> like an issue somehow. Can someone please look at my question specifically
> again?
