cassandra-user mailing list archives

From Joe Stump <>
Subject Re: Pyramid Organization of Data
Date Fri, 08 Apr 2011 20:43:51 GMT
A few lines of Java in a partitioning or rack-aware strategy might be able to achieve this.
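
To make that idea concrete, here is a rough sketch of the placement rule only, in plain Java. It does not use Cassandra's real strategy classes; the class name, the method names, the "NewYork" hub label and the node names are all made up for illustration. The rule it models is the one from the question: data that originates in a regional data center gets replicas there and in the hub, and nowhere else.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of "pyramid" replica placement; NOT Cassandra's actual strategy API. */
public class PyramidPlacement {
    // Hypothetical label for the central data center that receives every region's data.
    static final String HUB = "NewYork";

    /**
     * Returns how many replicas each data center should hold for data that
     * originates in originDc: localReplicas at the origin, hubReplicas at the hub.
     * No other data center appears in the plan.
     */
    static Map<String, Integer> replicasFor(String originDc, int localReplicas, int hubReplicas) {
        Map<String, Integer> plan = new LinkedHashMap<>();
        if (originDc.equals(HUB)) {
            // Data born in the hub stays in the hub.
            plan.put(HUB, Math.max(localReplicas, hubReplicas));
        } else {
            plan.put(originDc, localReplicas);
            plan.put(HUB, hubReplicas);
        }
        return plan;
    }

    /**
     * Picks concrete nodes for a plan. Within each data center it starts at an
     * index derived from tokenHash and walks the node list in order, wrapping around.
     */
    static List<String> pickNodes(Map<String, List<String>> nodesByDc,
                                  Map<String, Integer> plan,
                                  int tokenHash) {
        List<String> chosen = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : plan.entrySet()) {
            List<String> nodes = nodesByDc.getOrDefault(entry.getKey(), List.of());
            if (nodes.isEmpty()) {
                continue; // unknown or empty data center: nothing to place
            }
            int count = Math.min(entry.getValue(), nodes.size());
            int start = Math.floorMod(tokenHash, nodes.size());
            for (int i = 0; i < count; i++) {
                chosen.add(nodes.get((start + i) % nodes.size()));
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Hypothetical topology, for illustration only.
        Map<String, List<String>> nodesByDc = new LinkedHashMap<>();
        nodesByDc.put("Tokyo", List.of("tok1", "tok2"));
        nodesByDc.put("London", List.of("lon1", "lon2"));
        nodesByDc.put("NewYork", List.of("ny1", "ny2", "ny3"));

        Map<String, Integer> plan = replicasFor("Tokyo", 1, 2);
        System.out.println(plan);
        System.out.println(pickNodes(nodesByDc, plan, 4));
    }
}
```

In a real cluster this logic would have to live inside a replication strategy, or be expressed as per-data-center replica counts on one keyspace per region if NetworkTopologyStrategy fits your version; I have not tested either against this setup, so treat it as a starting point.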


Typed with big fingers on a small keyboard. 

On Apr 8, 2011, at 13:17, Patrick Julien <> wrote:

> We have a pilot project running in which all our historical data
> worldwide would be stored in Cassandra.  So far, we have been
> successful at getting the write and read throughput we need, in fact
> coming in more than 27% above our needed capacity and well beyond what
> we were able to achieve with MySQL, which is very impressive.
> However, one thing that escapes me is how we should organize different
> data center access.
> The scenario is the following:
> - We have data centers in North America, London, Tokyo and so on.
> - The relative cost of data centers is very different, e.g., TCO for
> one server in Tokyo is about the same as for 5 such servers in New
> York.
> - We want to have access to all the data from North America, hence we
> would run Hadoop/Pig queries from the New York/North America data
> center only.
> The problem is this: we would like the historical data from Tokyo to
> stay in Tokyo and be replicated only to New York, the data from London
> to stay in London and be replicated only to New York, and so on for all
> data centers.
> Is this currently possible with Cassandra?  I believe we would need to
> run multiple clusters and migrate data manually from the regional data
> centers to North America to achieve this.  Any suggestions would be
> welcome.
