cassandra-user mailing list archives

From Jonathan Ballet <jbal...@edgelab.ch>
Subject Many keyspaces pattern
Date Tue, 24 Nov 2015 10:05:30 GMT
Hi,

we are running an application that produces a batch of several hundred
gigabytes of data every night. Once a batch has been computed, it is
never modified (no updates, no deletes); we just keep producing new
batches every day.

Now, we are *sometimes* interested in removing a specific batch
altogether. At the moment, we accumulate all this data in a single
keyspace, and every table has a batch ID column that is also part of
the primary key. A sample table looks similar to this:

   CREATE TABLE computation_results (
       batch_id int,
       id1 int,
       id2 int,
       value double,
       PRIMARY KEY ((batch_id, id1), id2)
   ) WITH CLUSTERING ORDER BY (id2 ASC);

But we found out that it is very difficult to remove a specific batch:
we need to know all the IDs in order to delete the entries, and doing so
is both time- and resource-consuming (i.e. it takes a long time, and I'm
not sure it's going to scale at all).
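
To illustrate (a sketch against the sample table above; the values 42
and 7 are made up): because the partition key is (batch_id, id1), a
delete has to name both columns, so a batch cannot be removed with a
single statement on batch_id alone.

   -- Rejected: id1, the other partition key column, is missing
   DELETE FROM computation_results WHERE batch_id = 42;

   -- Accepted, but removes only one partition; it has to be repeated
   -- for every id1 value that exists in the batch
   DELETE FROM computation_results WHERE batch_id = 42 AND id1 = 7;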

So, we are currently looking into putting each of our batches in a
keyspace of its own, so that removing a batch simply amounts to dropping
a keyspace. Potentially, this means we will end up with several hundred
keyspaces in one cluster, although most of the time only the most recent
one will be used (we might still want to access the older ones, but that
would be a very rare use case). At the moment, the keyspace has about 14
tables, and that is probably not going to change much.
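
Roughly, this is what I have in mind (a sketch only: the keyspace name
batch_20151124 and the replication settings are placeholders, and
batch_id is dropped from the key on the assumption that the keyspace
itself now identifies the batch):

   CREATE KEYSPACE batch_20151124
       WITH replication = {'class': 'SimpleStrategy',
                           'replication_factor': 3};

   CREATE TABLE batch_20151124.computation_results (
       id1 int,
       id2 int,
       value double,
       PRIMARY KEY (id1, id2)
   ) WITH CLUSTERING ORDER BY (id2 ASC);

   -- Removing the whole batch later becomes a single statement
   DROP KEYSPACE batch_20151124;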


Are there any contraindications to using a lot of keyspaces (300+) in
one Cassandra cluster?
Are there any good practices that we should follow?
After reading "Anti-patterns in Cassandra > Too many keyspaces or
tables": does it mean we should plan ahead and split our keyspaces
across several clusters?

Finally, would there be any other way to achieve what we want to do?

Thanks for your help!

  Jonathan
