incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Single Vs. Multiple Keyspaces
Date Thu, 19 Apr 2012 10:40:53 GMT
I would suggest you build one cluster, using all your nodes, and create one keyspace for all
users.

There are lots of reasons; here are a few:

* many nodes in a single cluster spread the load and give you fault tolerance.
* read and write requests can be distributed in a many node cluster.
* cassandra caches and os level file caches will be shared
* cassandra does not suffer from locking and contention during reads and writes
* you can prefix row keys to create "virtual keyspaces"  
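
The "virtual keyspaces" idea in the last point can be sketched roughly as below. This is only an illustration: the ":" separator and the helper names (make_row_key, split_row_key, belongs_to) are assumptions made for the example, not part of Cassandra or any client library, and the code only builds key strings without talking to a cluster.

```python
from typing import Tuple

# Assumed delimiter between the user prefix and the rest of the key.
# It must never appear inside a user id, or prefixes become ambiguous.
SEPARATOR = ":"


def make_row_key(user_id: str, local_key: str) -> str:
    """Build a row key that is namespaced by the owning user."""
    if not user_id or SEPARATOR in user_id:
        raise ValueError("user_id must be non-empty and must not contain the separator")
    return f"{user_id}{SEPARATOR}{local_key}"


def split_row_key(row_key: str) -> Tuple[str, str]:
    """Split a prefixed row key back into (user_id, local_key)."""
    user_id, sep, local_key = row_key.partition(SEPARATOR)
    if not sep:
        raise ValueError("row key has no user prefix")
    return user_id, local_key


def belongs_to(row_key: str, user_id: str) -> bool:
    """Check ownership by comparing the full prefix, separator included."""
    return row_key.startswith(user_id + SEPARATOR)


if __name__ == "__main__":
    key = make_row_key("user42", "2012-04-19:access")
    print(key)
    print(split_row_key(key))
    print(belongs_to(key, "user4"))
```

Including the separator in the ownership check keeps "user4" from matching keys that belong to "user42". Every read and write would then build its row key through the same helper, so one user's rows never collide with another's inside the shared keyspace.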

Hope that helps. 

Aaron

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 19/04/2012, at 4:33 AM, Trevor Francis wrote:

> We are launching a data-intensive application that will store upwards of 50 million
> 150-byte records per day per user. We have identified Cassandra as our database technology
> and Flume as what we will use to seed the data from log files into the database.
> 
> Each user is given their own server instance, but the schema of the data for each user
> will be the same.
> 
> We will be performing realtime analysis on this information as part of our application
> and were considering the advantages/disadvantages of all users using the same keyspace. All
> data will be treated the same as far as replication factor, and the only difference is we won't
> be displaying one user's info to another user. They will be compartmentalized and one user's
> data will not affect or ever be compared against another user's.
> 
> Conceptualize this as each user having their own Apache server, and that server spits out
> 50 million records per day, and each user will only be analyzing the data for their particular
> server, not anyone else's. The log formats are exactly the same.
> 
> My experience lies in relational databases and not key-value stores like Cassandra.
> So, in the MySQL world we would put each user in their own database to avoid the locking contention
> and to make queries faster.
> 
> If we don't post info into different keyspaces, I assume we will have to add an additional
> field to our records to identify the user that owns that particular record. How does a single
> large keyspace affect query speed, etc.?
> 
> 
> 
> Trevor Francis
> 
> 

