incubator-cassandra-user mailing list archives

From Jeremiah Jordan <jeremiah.jor...@morningstar.com>
Subject Re: DataCenters each with their own local data source
Date Wed, 23 Nov 2011 02:49:02 GMT
Cassandra's multiple data center support is meant for efficiently replicating all data across
multiple data centers.

You could use the ByteOrderedPartitioner, prefix each row key with a per-data-center value, and
assign the resulting key ranges to nodes in specific data centers. The nodes at the edges of
each range would get tricky, though, because they would want to place replicas in other data
centers. You could probably avoid that with sentinel values, plus some nodes that only hold
replicas and aren't the primary node for any data.
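
To make the key-prefix idea concrete, here is a minimal sketch of how such keys might be built.
It relies only on the fact that a byte-ordered partitioner places rows in raw key-byte order.
The key layout, the ":" separator, the "dc01"-style prefixes, and both function names are
assumptions made for illustration; none of them come from Cassandra itself.

```python
import struct


def make_row_key(dc_prefix: bytes, series_id: bytes, bucket: int) -> bytes:
    """Build a key that sorts by data center, then series, then time bucket.

    The layout (prefix ":" series ":" bucket) is a hypothetical convention.
    """
    if b":" in dc_prefix:
        raise ValueError("dc_prefix must not contain the ':' separator")
    # Big-endian unsigned 64-bit, so byte order matches numeric order.
    return dc_prefix + b":" + series_id + b":" + struct.pack(">Q", bucket)


def key_range_for(dc_prefix: bytes) -> tuple:
    """Return the half-open [start, end) byte range covering one prefix."""
    start = dc_prefix + b":"
    end = dc_prefix + b";"  # ';' (0x3b) is the byte right after ':' (0x3a)
    return start, end


if __name__ == "__main__":
    keys = [
        make_row_key(b"dc02", b"cpu", 5),
        make_row_key(b"dc01", b"cpu", 300),
        make_row_key(b"dc01", b"cpu", 2),
    ]
    # Sorting raw bytes groups all dc01 keys ahead of all dc02 keys.
    for key in sorted(keys):
        print(key)
```

Because every key for one data center falls inside a single contiguous byte range, that range
could in principle be assigned to nodes in that data center. The replica placement problem at
the range edges is not addressed by this sketch.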

It is doable, though this would probably be more trouble than it is worth.  I would probably
just make each DC its own cluster and have client logic that knows which DC to query.
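
As a rough sketch of that client logic, the routing layer can be little more than a lookup
table from a data center name to the contact points of that data center's own cluster. The
class name, the "dc:rest-of-key" key convention, and the host names below are all hypothetical,
and the sketch deliberately stops short of opening real Cassandra connections.

```python
from typing import Dict, List


class DataCenterRouter:
    """Maps a row key to the contact points of the cluster that owns it."""

    def __init__(self, clusters: Dict[str, List[str]]) -> None:
        if not clusters:
            raise ValueError("at least one data center is required")
        # Copy the host lists so later changes by the caller do not leak in.
        self._clusters = {dc: list(hosts) for dc, hosts in clusters.items()}

    def dc_for_key(self, row_key: str) -> str:
        """Extract the data center name from a 'dc:rest' style key."""
        dc, sep, _ = row_key.partition(":")
        if not sep or not dc:
            raise ValueError(f"key has no data center prefix: {row_key!r}")
        return dc

    def contact_points(self, dc: str) -> List[str]:
        """Return the hosts of the independent cluster in the given DC."""
        if dc not in self._clusters:
            raise KeyError(f"unknown data center: {dc!r}")
        return list(self._clusters[dc])

    def route(self, row_key: str) -> List[str]:
        """Return the hosts a client should contact for this key."""
        return self.contact_points(self.dc_for_key(row_key))


if __name__ == "__main__":
    router = DataCenterRouter({
        "dc01": ["dc01-node1.example.com", "dc01-node2.example.com"],
        "dc02": ["dc02-node1.example.com"],
    })
    print(router.route("dc02:cpu:2011-11-22"))
```

A read that spans several data centers would call route (or contact_points) once per data
center and merge the results in the client or in a middle tier.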

-Jeremiah

On Nov 22, 2011, at 6:57 PM, Mathieu Lalonde wrote:

> 
> 
> Hi,
> 
> I am wondering if Cassandra's features and datacenter awareness can help me with my scalability
> problems.
> 
> Suppose that I have 10-20 data centers, each with its own local (massive) source
> of time series data.  I would like:
> - to avoid replication across data centers (this seems doable based on:
>   http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Different-KeySpaces-for-different-nodes-in-the-same-ring-td5096393.html#a5096568)
> - writes for local data to be done on the local data center (not sure about that one)
> - reads from a master data center to any remote data centers (not sure about that one
>   either)
> 
> It sounds like I am trying to use Cassandra in a very different way than it was intended
> to be used.
> Should I simply have a middle-tier that takes care of distributing reads to multiple
> data centers and treat each data center as its own autonomous cluster?
> 
> Thanks!
> Matt
> 

