incubator-cassandra-user mailing list archives

From Shahab Yunus <shahab.yu...@gmail.com>
Subject Re: Modeling multi-tenanted Cassandra schema
Date Wed, 13 Nov 2013 14:35:39 GMT
Nate,

(slightly OT), what client API/library is recommended now that Hector is
sunsetting? Thanks.

Regards,
Shahab


On Wed, Nov 13, 2013 at 9:28 AM, Nate McCall <nate@thelastpickle.com> wrote:

> You basically want option (c). Option (d) might work, but you would be
> bending the paradigm a bit, IMO. Certainly do not use dedicated column
> families or keyspaces per tenant. That never works. The list history will
> show that with a few Google searches, and we've seen it fail badly with
> several clients.
>
> Overall, option (c) would be difficult to do in CQL without some very
> well-thought-out abstractions and/or a deep hack on the Java driver (not
> inelegant or impossible, just lots of moving parts to get your head
> around if you are new to this). That said, depending on the size of your
> project and skill of your team, this direction might be worth considering.
>
> Usergrid (just accepted for incubation at Apache) functions this way via
> the Thrift API: https://github.com/apigee/usergrid-stack
>
> The commercial version of Usergrid has "tens of thousands" of active
> tenants on a single cluster (same code base at the service layer as the
> open source version). It uses Hector's built-in virtual keyspaces:
> https://github.com/hector-client/hector/wiki/Virtual-Keyspaces (NOTE:
> though Hector is sunsetting/in patch maintenance, the approach is certainly
> legitimate - but I'd recommend you *not* start a new project on Hector).
>
> In short, Usergrid is the only project I know of that has a well-proven
> tenant model that functions at scale, though I'm sure there are others
> around, just not open sourced or actually running large deployments.
>
> Astyanax can do this as well, albeit with a little more work required:
>
> https://github.com/Netflix/astyanax/wiki/Composite-columns#how-to-use-the-prefixedserializer-but-you-really-should-use-composite-columns
>
>
> Happy to clarify any of the above.
>
>
> On Tue, Nov 12, 2013 at 3:19 AM, Ben Hood <0x6e6562@gmail.com> wrote:
>
>> Hi,
>>
>> I've just received a requirement to make a Cassandra app
>> multi-tenanted, where we'll have up to 100 tenants.
>>
>> Most of the tables are timestamped wide-row tables with a natural
>> application key for the partitioning key and a timestamp key as a
>> clustering key.
>>
>> So I was considering the options:
>>
>> (a) Add a tenant column to each table and stick a secondary index on
>> that column;
>> (b) Add a tenant column to each table and maintain index tables that
>> use the tenant id as a partitioning key;
>> (c) Decompose the partitioning key of each table and add the tenant
>> as the leading component of the key;
>> (d) Add the tenant as a separate clustering key;
>> (e) Replicate the schema in separate tenant-specific keyspaces;
>> (f) Something I may have missed;
>>
>> Option (a) seems the easiest, but I'm wary of just adding secondary
>> indexes without thinking about it.
>>
>> Option (b) seems to have the least impact on the layout of the
>> storage, but carries a cost of maintaining each index table, both
>> code-wise and in terms of performance.
>>
>> Option (c) seems quite straightforward, but I feel it might have a
>> significant effect on the distribution of the rows, if the cardinality
>> of the tenants is low.
>>
>> Option (d) seems simple enough, but it would mean that you couldn't
>> query for a range of tenants without supplying a range of natural
>> application keys, through which you would need to iterate (under the
>> assumption that you don't use an ordered partitioner).
>>
>> Option (e) appears relatively straightforward, but it does mean that
>> the application CQL client needs to maintain separate cluster
>> connections for each tenant. Also I'm not sure to what extent
>> keyspaces were designed to partition identically structured data.
>>
>> Does anybody have any experience with running a multi-tenanted
>> Cassandra app, or does this just depend too much on the specifics of
>> the application?
>>
>> Cheers,
>>
>> Ben
>>
>
>
>
> --
> -----------------
> Nate McCall
> Austin, TX
> @zznate
>
> Co-Founder & Sr. Technical Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
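
For reference, option (c) from the quoted thread can be sketched roughly as
below. This is only an illustration under stated assumptions: the table and
column names (events, tenant_id, app_key, ts, payload) are hypothetical and
not from the thread, and toy_token/node_for are a toy stand-in for the
partitioner (MD5 plus modulo), not how Cassandra actually assigns token
ranges. The point it tries to show is that when the tenant is the leading
component of a composite partition key, the whole composite key is hashed,
so row distribution follows the combined (tenant, application key)
cardinality rather than the number of tenants alone.

```python
import hashlib
from collections import Counter

# Hypothetical CQL for option (c): the tenant leads a composite partition key.
# All names here are illustrative assumptions, not taken from the thread.
CREATE_EVENTS = """
CREATE TABLE events (
    tenant_id text,
    app_key   text,
    ts        timestamp,
    payload   text,
    PRIMARY KEY ((tenant_id, app_key), ts)
)
"""


def toy_token(tenant_id: str, app_key: str) -> int:
    """Toy stand-in for a hash partitioner: hash the whole composite key."""
    raw = f"{tenant_id}:{app_key}".encode("utf-8")
    return int.from_bytes(hashlib.md5(raw).digest()[:8], "big")


def node_for(tenant_id: str, app_key: str, node_count: int) -> int:
    """Map a composite key to a node index (simplified: modulo, no vnodes)."""
    return toy_token(tenant_id, app_key) % node_count


def spread(tenants, keys_per_tenant: int, node_count: int) -> Counter:
    """Count how many partitions land on each node for the given tenants."""
    counts = Counter()
    for tenant in tenants:
        for i in range(keys_per_tenant):
            counts[node_for(tenant, f"key-{i}", node_count)] += 1
    return counts


# Two tenants only, but 2000 distinct composite keys across 4 toy nodes.
counts = spread(["acme", "globex"], 1000, 4)
for node, n in sorted(counts.items()):
    print(f"node {node}: {n} partitions")
```

With only two tenants the partitions should still spread over all toy nodes,
because each (tenant, application key) pair hashes independently. The
trade-off is that every query must then supply the tenant as part of the
partition key.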
