incubator-cassandra-user mailing list archives

From aaron morton <>
Subject Re: Replica data distributing between racks
Date Mon, 02 May 2011 19:18:22 GMT
That appears to be working as designed, but the result does not sound great.

When NetworkTopologyStrategy (NTS) selects replicas in a DC, it orders the tokens available
in the DC, then (in the first pass) iterates through them, placing a replica in each unique
rack. For example, if the RF in each DC were 2, the replicas would be put on 2 unique racks
if possible. So the lowest token in the DC will *always* get a write.
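
As a rough illustration of the selection described above, here is a simplified Python model.
It is not the actual NTS source. The node names, the token-to-node mapping, and the second
pass that fills any remaining slots are assumptions made for the sketch.

```python
from typing import List, NamedTuple


class Node(NamedTuple):
    name: str   # hypothetical label, not from the original message
    token: int
    rack: str


def select_replicas_in_dc(dc_nodes: List[Node], rf: int) -> List[Node]:
    """Pick rf replicas in one DC by walking its tokens from the lowest."""
    ordered = sorted(dc_nodes, key=lambda n: n.token)
    replicas: List[Node] = []
    seen_racks = set()

    # First pass: at most one replica per unique rack, in token order.
    for node in ordered:
        if len(replicas) == rf:
            break
        if node.rack not in seen_racks:
            replicas.append(node)
            seen_racks.add(node.rack)

    # Second pass (assumed): if there are fewer racks than rf,
    # fill the remaining slots with nodes not chosen yet.
    for node in ordered:
        if len(replicas) == rf:
            break
        if node not in replicas:
            replicas.append(node)

    return replicas


# Assumed layout for one DC: two racks, one node each.
ny1 = [Node("ny-a", 0, "RAC1"), Node("ny-b", 2**127 // 4, "RAC2")]

print([n.name for n in select_replicas_in_dc(ny1, rf=1)])
print([n.name for n in select_replicas_in_dc(ny1, rf=2)])
```

In this model the key never enters the decision, so with RF 1 the node holding the lowest
token is chosen every time, and with RF 2 both racks receive a copy.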

It's not possible to load balance between the racks because no state is shared between requests.
A possible alternative would be to find the nearest token to the key and start allocating
replicas from there. But as each DC contains only a part (say half) of the token range, the
likelihood is that half of the keys would map to one end or the other of the DC's range, so
it would not be a great solution.

I think what you are trying to achieve is not possible. Do you have the capacity to run RF
2 in each DC? That would at least even things out.


On 3 May 2011, at 06:40, Eric tamme wrote:

> I am experiencing an issue where replication is not being distributed
> between racks when using PropertyFileSnitch in conjunction with
> NetworkTopologyStrategy.
> I am running 0.7.3 from a tar.gz on
> I have 4 nodes, 2 data centers, and 2 racks in each data center.  Each
> rack has 1 node.
> I have even token distribution so that each node gets 25%:
> 0
> 42535295865117307932921825928971026432
> 85070591730234615865843651857942052864
> 127605887595351923798765477786913079296
> My is as follows:
> # Cassandra Node IP=Data Center:Rack
> ffff\:0\:ffff\:eeee\:\:fffe=NY1:RAC1
> ffff\:0\:ffff\:eeee\:\:ffff=NY1:RAC2
> ffff\:0\:ffff\:ffff\:\:fffe=LA1:RAC1
> ffff\:0\:ffff\:ffff\:\:ffff=LA1:RAC2
> # default for unknown nodes
> default=NY1:RAC1
> My Keyspace replication strategy is as follows:
> Keyspace: SipTrace:
>  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
>    Options: [LA1:1,NY1:1]
> So each data center should get 1 copy of the data, and this does
> happen.  The problem is that the replicated copies get pinned to the
> first host configured in the properties file, from what I can discern,
> and DO NOT distribute between racks.  So I have 2 nodes that have a 4
> to 1 ratio of data compared to the other 2 nodes.  This is a problem!
> Can anyone please tell me if I have misconfigured this?  Or how I can
> get replica data to distribute evenly between racks within a
> datacenter?  I was led to believe that Cassandra will try to
> distribute between racks for replica data automatically under this
> setup.
> Thank you for your help in advance!
> -Eric
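
For reference, the token list quoted above appears to follow i * 2**127 / 4. A minimal
sketch for generating evenly spaced tokens, assuming a token space of 0 to 2**127 (the
quoted message does not name the partitioner; `even_tokens` is a made-up helper name):

```python
from typing import List

RING = 2**127  # assumed size of the token space


def even_tokens(node_count: int) -> List[int]:
    """Return node_count tokens spaced evenly around the ring, starting at 0."""
    return [i * RING // node_count for i in range(node_count)]


for token in even_tokens(4):
    print(token)
```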
