cassandra-user mailing list archives

From Jeremiah Jordan <jeremiah.jor...@morningstar.com>
Subject Re: DataCenters each with their own local data source
Date Wed, 23 Nov 2011 02:51:05 GMT
Oops, I was thinking of everything living in the same keyspace.  If you made a new keyspace
for each DC, you could specify where to put its data and have that data live in only one place.
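
A minimal sketch of the keyspace-per-DC idea, assuming NetworkTopologyStrategy and the current CQL CREATE KEYSPACE syntax (the cassandra-cli syntax differs). The data center names, the "ts_" keyspace prefix, and the replication factor are hypothetical; the DC names would have to match whatever your snitch reports.

```python
def keyspace_ddl(dc_name: str, replication_factor: int = 3) -> str:
    """Build a CREATE KEYSPACE statement whose replicas live in one DC only."""
    # Hypothetical naming scheme: one keyspace per data center.
    keyspace = "ts_" + dc_name.lower()
    # NetworkTopologyStrategy places replicas only in the DCs listed in the
    # replication map, so listing a single DC keeps the data out of the others.
    return (
        f"CREATE KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', '{dc_name}': {replication_factor}}};"
    )


# Hypothetical data center names.
for dc in ["DC1", "DC2"]:
    print(keyspace_ddl(dc))
```

Each statement would be run once against the ring; clients in a given DC then read and write only their own keyspace.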

-Jeremiah

On Nov 22, 2011, at 8:49 PM, Jeremiah Jordan wrote:

> Cassandra's Multiple Data Center Support is meant for replicating all data across multiple
> data centers efficiently.
> 
> You could use the Byte Ordered Partitioner, prefix each row key with a data center identifier,
> and assign those key ranges to nodes in specific data centers.  The edge nodes would get
> tricky, though, since they would want to place replicas in other data centers.  You could
> probably prevent that with sentinel values and some nodes that only hold replicas and
> aren't the primary node for any data.
> 
> It is doable, though this would probably be more trouble than it is worth.  I would probably
> just make each DC its own cluster and have client logic which knows which DC to query.
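
The "client logic which knows which DC to query" could look roughly like the sketch below. Everything in it is an assumption for illustration: the cluster names, the contact point addresses, and the convention that each row key starts with its data center name followed by a colon.

```python
from typing import Dict, List

# Hypothetical mapping from data center name to that cluster's contact points.
CLUSTERS: Dict[str, List[str]] = {
    "DC1": ["10.0.1.1", "10.0.1.2"],
    "DC2": ["10.0.2.1", "10.0.2.2"],
}


def contact_points_for(row_key: str) -> List[str]:
    """Pick the cluster that owns a key of the assumed form '<dc>:<series id>'."""
    dc, sep, _ = row_key.partition(":")
    # Reject keys without a DC prefix or with an unknown DC name.
    if not sep or dc not in CLUSTERS:
        raise KeyError(f"no cluster known for key {row_key!r}")
    return CLUSTERS[dc]


print(contact_points_for("DC2:sensor-42"))
```

The returned addresses would be handed to whichever Cassandra client library is in use; the routing itself stays independent of the driver.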
> 
> -Jeremiah
> 
> On Nov 22, 2011, at 6:57 PM, Mathieu Lalonde wrote:
> 
>> 
>> 
>> Hi,
>> 
>> I am wondering if Cassandra's features and datacenter awareness can help me with
>> my scalability problems.
>> 
>> Suppose that I have 10-20 data centers, each with its own local (massive) source
>> of time series data.  I would like:
>> - to avoid replication across data centers (this seems doable based on:
>>   http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Different-KeySpaces-for-different-nodes-in-the-same-ring-td5096393.html#a5096568 )
>> - writes for local data to be done on the local data center (not sure about that one)
>> - reads from a master data center to any remote data centers (not sure about that
>>   one either)
>> 
>> It sounds like I am trying to use Cassandra in a very different way than it was intended
>> to be used.
>> Should I simply have a middle tier that takes care of distributing reads to multiple
>> data centers and treat each data center as its own autonomous cluster?
>> 
>> Thanks!
>> Matt
>> 
> 

