incubator-cassandra-user mailing list archives

From Mathieu Lalonde
Subject RE: DataCenters each with their own local data source
Date Wed, 23 Nov 2011 02:15:20 GMT


Thanks for the quick reply.  Sorry if my question was not clear.  I have tried to provide more detail below.

> Date: Tue, 22 Nov 2011 20:43:33 -0500 
> Subject: Re: DataCenters each with their own local data source 
> From: 
> To: 
> Distributing writes to all D.C.s? or reads?

Writes should be distributed within the local D.C. only.
Reads could be specific to one D.C., or could query all D.C.s.

> If each D.C. has data specific to that particular geo, why do you have  
> to read from remote D.C. ? 

Users want to aggregate information from different D.C.s.

> You can easily incorporate logic to redirect each operation (either
> write or read) to the appropriate (local) D.C.

In the middle tier, or does Cassandra have explicit support for that?

> Still wondering why you want to do so. I am assuming you want to store
> data per I.P. (geolocation data). Anyway, it was not very clear
> from your question what you are trying to do.

Each data center has a local data source that we can't afford to replicate across data centers.
Users may be interested in querying multiple data centers.
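
To make the middle-tier option concrete, here is a minimal Python sketch, assuming each data center runs its own autonomous cluster and exposes some way to fetch samples for a time range. The names MiddleTier, in_memory_fetcher and merge_by_time are hypothetical and exist only for illustration; a real fetcher would wrap whatever client talks to that data center's cluster. Nothing here is a Cassandra API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# Hypothetical types for illustration: a sample is (timestamp, value), and a
# fetcher returns the samples held by ONE data center for [start, end).
Sample = Tuple[int, float]
Fetcher = Callable[[int, int], List[Sample]]


def in_memory_fetcher(samples: List[Sample]) -> Fetcher:
    """Stand-in for a per-data-center client; filters a fixed list."""
    def fetch(start: int, end: int) -> List[Sample]:
        return [(t, v) for t, v in samples if start <= t < end]
    return fetch


class MiddleTier:
    """Routes reads to the local data center or fans them out to all."""

    def __init__(self, local_dc: str, fetchers: Dict[str, Fetcher]):
        self.local_dc = local_dc
        self.fetchers = fetchers

    def read_local(self, start: int, end: int) -> List[Sample]:
        # D.C.-specific read: only the local cluster is contacted.
        return self.fetchers[self.local_dc](start, end)

    def read_all(self, start: int, end: int):
        # Scatter: query every data center in parallel.
        # Gather: collect per-D.C. results, and per-D.C. errors, so that one
        # unreachable data center does not fail the whole query.
        results: Dict[str, List[Sample]] = {}
        errors: Dict[str, Exception] = {}
        workers = max(1, len(self.fetchers))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = {dc: pool.submit(fetch, start, end)
                       for dc, fetch in self.fetchers.items()}
            for dc, future in futures.items():
                try:
                    results[dc] = future.result()
                except Exception as exc:
                    errors[dc] = exc
        return results, errors


def merge_by_time(results: Dict[str, List[Sample]]) -> List[Tuple[int, str, float]]:
    """Flatten per-D.C. results into one list ordered by timestamp."""
    return sorted((t, dc, v) for dc, samples in results.items() for t, v in samples)
```

Writes would go through the same routing table but only ever use the local entry, so no data crosses data centers on the write path; only aggregate reads pay the cross-D.C. latency.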

> Thanks, 
> Jahangir Mohammed. 
> On Tue, Nov 22, 2011 at 7:57 PM, Mathieu Lalonde wrote:
> Hi, 
> I am wondering if Cassandra's features and datacenter awareness can  
> help me with my scalability problems. 
> Suppose that I have 10-20 data centers, each with their own local
> (massive) source of time series data.  I would like:
> - to avoid replication across data centers (this seems doable based on:  
> ) 
> - writes for local data to be done on the local data center (not sure  
> about that one) 
> - reads from a master data center to any remote data centers (not sure  
> about that one either) 
> It sounds like I am trying to use Cassandra in a very different way
> than it was intended to be used.
> Should I simply have a middle-tier that takes care of distributing  
> reads to multiple data centers and treat each data center as its own  
> autonomous cluster? 
> Thanks! 
> Matt 