We had a similar problem: multitenancy plus multi-DC support. But we
did not really have a strict requirement of one keyspace per tenant; our
row keys allow us to put any number of tenants in one keyspace.
So, on the one hand, we could put all tenants' data in a single keyspace
and size the cluster for it; in the end the total amount of data would be
the same :)
However, we wanted different replication strategies for different
customers, and the replication strategy is a keyspace setting. Thus, it
would be simpler to have one keyspace per customer.
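To illustrate the point about replication being a keyspace-level setting, here is a rough sketch (keyspace names, DC names and replication factors are all made up) of two keyspaces with different strategies:

```sql
-- Hypothetical example: replication is configured per keyspace, so
-- tenants with different replication requirements need different keyspaces.
CREATE KEYSPACE tenant_group_1
  WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3};

CREATE KEYSPACE tenant_group_2
  WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 2};
```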
The cost, as was mentioned, is per CF: the more keyspaces we have, the
more CFs we have. So we did not want that number to grow too high.
The decision we made was something in between. We define a small
number of keyspaces with different replication strategies (possibly even
duplicate ones) and map tenants to these keyspaces. Thus, there would be a
couple of tenants in each keyspace, all sharing the same properties
(the replication strategy, in our case). We could even create a keyspace
that groups tenants that currently share the same replication
requirements and that may be moved/replicated to a specific DC in the
future.
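The mapping scheme above could be sketched roughly like this (a minimal illustration; the keyspace names, tenant ids and the row-key format are all hypothetical, not our actual ones):

```python
# Hypothetical sketch: a small fixed set of shared keyspaces, each with
# its own replication settings, and a tenant -> keyspace mapping.
KEYSPACES = {
    "ks_dc1_rf3": {"class": "NetworkTopologyStrategy", "DC1": 3},
    "ks_dc1_dc2": {"class": "NetworkTopologyStrategy", "DC1": 3, "DC2": 2},
}

# Tenants are grouped by replication requirements instead of getting
# one keyspace each, which keeps the total CF count bounded.
TENANT_KEYSPACE = {
    "tenant_a": "ks_dc1_rf3",
    "tenant_b": "ks_dc1_rf3",
    "tenant_c": "ks_dc1_dc2",
}

def keyspace_for(tenant_id: str) -> str:
    """Return the shared keyspace a tenant's data lives in."""
    return TENANT_KEYSPACE[tenant_id]

def row_key(tenant_id: str, entity_id: str) -> str:
    """Row keys carry the tenant id, so many tenants can share a keyspace."""
    return f"{tenant_id}:{entity_id}"
```

Moving a tenant group to a new DC later then amounts to altering one keyspace's replication settings rather than touching every tenant individually.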
On Wed, Dec 3, 2014 at 4:54 PM, Raj N <raj.cassandra@gmail.com> wrote:
> The question is more from a multitenancy point of view. We wanted to see
> if we can have a keyspace per client. Each keyspace may have 50 column
> families, but if we have 200 clients, that would be 10,000 column families.
> Do you think that's reasonable to support? I know that key cache capacity
> is reserved in heap still. Any plans to move it offheap?
>
> Raj
>
> On Tue, Nov 25, 2014 at 3:10 PM, Robert Coli <rcoli@eventbrite.com> wrote:
>
>> On Tue, Nov 25, 2014 at 9:07 AM, Raj N <raj.cassandra@gmail.com> wrote:
>>
>>> What's the latest on the maximum number of keyspaces and/or tables that
>>> one can have in Cassandra 2.1.x?
>>>
>>
>> Most relevant changes lately would be :
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-6689
>> and
>> https://issues.apache.org/jira/browse/CASSANDRA-6694
>>
>> Which should meaningfully reduce the amount of heap memtables consume.
>> That heap can then be used to support more heap-persistent structures
>> associated with many CFs. I have no idea how to estimate the scale of the
>> improvement.
>>
>> As a general/meta statement, Cassandra is very multithreaded, and
>> consumes file handles like crazy. How many different query cases do you
>> really want to put on one cluster/node? ;D
>>
>> =Rob
>>
>>
>

Nikolai Grigoriev
(514) 772-5178
