cassandra-user mailing list archives

From Mimi Aluminium <mimi.alumin...@gmail.com>
Subject Re: cluster size, several cluster on one node for multi-tenancy
Date Fri, 18 Feb 2011 18:49:02 GMT
Nick,
Assume I have a tenant with only one CF, and I am using the NetworkAware
replication strategy, where the keys of this CF are replicated 3 times, each
copy in a different DC (DC1, DC2, DC3).
Now let's assume the cluster holds 5 DCs. As far as I understand, only the
servers that belong to the three DCs that hold a copy will build this
CF's memtable. The servers that belong to the other 2 DCs (DC4, DC5) won't
have any knowledge of this CF or its keyspace. Am I correct?

I have an additional, more basic question:
Is there a way to define two clusters on the same node? Is it done by
configuration in the storage-conf file, or does it mean running an additional
Cassandra daemon?
Thanks a lot,
Miriam

On Fri, Feb 18, 2011 at 12:08 PM, Nick Telford <nick.telford@gmail.com> wrote:

> Large numbers of keyspaces/column-families are not a good idea, as each
> column-family memtable requires its own memory. If you have 1000 tenants in
> the same cluster, each with only 1 CF, regardless of the cluster size
> *every* node will require 1 memtable per tenant CF - 1000 memtables.
>
> This limitation is the primary reason for workarounds (such as "virtual
> keyspaces") to enable multi-tenant setups.
>
> You might have more luck partitioning tenants into different clusters, but
> then you end up with potential hot-spots (where more active tenants generate
> more load on a specific cluster).
>
> Regards,
> Nick
>
>
> On 18 February 2011 09:55, Mimi Aluminium <mimi.aluminium@gmail.com> wrote:
>
>> Thanks a lot for your suggestions.
>> I will check the virtual keyspace solution. By the way, I am currently using
>> the Thrift client with Pycassa and am not familiar with Hector. Does it mean
>> we'll need to move to the Hector client?
>>
>> I thought of using a keyspace for each tenant, but I don't understand how to
>> define the whole cluster. That is, assuming the tenants are distributed
>> (replicated) across hundreds of DCs, each consisting of tens of racks and
>> servers, can I define a single Cassandra cluster for all the servers? That
>> does not seem reasonable, which is why I thought of separating
>> the clusters. Please let me know how you would solve it.
>> Thanks,
>> Miriam
>>
>>
>>
>> On Thu, Feb 17, 2011 at 10:30 PM, Nate McCall <nate@datastax.com> wrote:
>>
>>> Hector's virtual keyspaces would work well for what you describe. Ed
>>> Anuff, who added this feature to Hector, showed me a working
>>> multi-tenancy based app the other day and it worked quite well.
>>>
>>> On Thu, Feb 17, 2011 at 1:44 PM, Norman Maurer <norman@apache.org> wrote:
>>> > Maybe you could make use of "Virtual Keyspaces".
>>> >
>>> > See this wiki for the idea:
>>> > https://github.com/rantav/hector/wiki/Virtual-Keyspaces
>>> >
>>> > Bye,
>>> > Norman
>>> >
>>> > 2011/2/17 Frank LoVecchio <frank@isidorey.com>:
>>> >> Why not just create some sort of ACL on the client side and use one
>>> >> Keyspace?  It's a lot less management.
>>> >>
>>> >> On Thu, Feb 17, 2011 at 12:34 PM, Mimi Aluminium <mimi.aluminium@gmail.com> wrote:
>>> >>>
>>> >>> Hi,
>>> >>> I really need your help in this matter.
>>> >>> I will try to simplify my problem and ask specific questions.
>>> >>>
>>> >>> I am thinking of solving the multi-tenancy problem by providing a
>>> >>> separate cluster per tenant. Does it sound reasonable?
>>> >>> I can end up with one node belonging to several clusters.
>>> >>> Does Cassandra support several clusters per node? Does it mean several
>>> >>> Cassandra daemons on each node? Do you recommend doing that? What is
>>> >>> the overhead? Is there any link that explains how to do that?
>>> >>>
>>> >>> Thanks a lot,
>>> >>> Mimi
>>> >>>
>>> >>>
>>> >>> On Wed, Feb 16, 2011 at 6:43 PM, Mimi Aluminium <mimi.aluminium@gmail.com> wrote:
>>> >>>>
>>> >>>> Hi,
>>> >>>> We are interested in a multi-tenancy environment that may consist of
>>> >>>> up to hundreds of data centers. The current design requires cross-rack
>>> >>>> and cross-DC replication. Specifically, the per-tenant CFs will be
>>> >>>> replicated 6 times: in three racks, with 2 copies inside each rack, and
>>> >>>> the racks will be located in at least two different DCs. In the future
>>> >>>> other replication policies will be considered. The application will
>>> >>>> decide where (which racks and DCs) to place each tenant's replicas, and
>>> >>>> one rack may hold more than one tenant.
>>> >>>>
>>> >>>> Separating each tenant into a different keyspace, as was suggested in a
>>> >>>> previous mail thread on this subject, seems to be a good approach
>>> >>>> (assuming the memtable problem will be solved somehow).
>>> >>>> But then we had a concern with regard to the cluster size.
>>> >>>> Here are my questions:
>>> >>>> 1) Given the above, should I define one Cassandra cluster that holds
>>> >>>> all the DCs? That does not sound reasonable given hundreds of DCs, tens
>>> >>>> of servers in each DC, etc. Where is the bottleneck here: keep-alive
>>> >>>> messages, the gossip, request routing? What is the largest number of
>>> >>>> servers a cluster can bear?
>>> >>>> 2) Now assuming that I can create the per-tenant keyspace only for the
>>> >>>> servers in the three racks where the replicas are held, does such a
>>> >>>> definition reduce the message traffic among the other servers? Does
>>> >>>> Cassandra optimize the message transfer in such a case?
>>> >>>> 3) An additional possible solution was to create a separate cluster per
>>> >>>> tenant. But it can cause a situation where one server has to run two or
>>> >>>> more Cassandra clusters. Can we run more than one cluster in parallel?
>>> >>>> Does it mean two Cassandra daemons / instances on one server? What will
>>> >>>> be the overhead? Do you have a link that explains how to deal with it?
>>> >>>>
>>> >>>> Please can you help me decide which of these solutions can work, or
>>> >>>> feel free to suggest something else.
>>> >>>> Thanks a lot,
>>> >>>> Mimi
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Frank LoVecchio
>>> >> Senior Software Engineer | Isidorey, LLC
>>> >> Google Voice +1.720.295.9179
>>> >> isidorey.com | facebook.com/franklovecchio | franklovecchio.com
>>> >>
>>> >
>>>
>>
>>
>
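Nick's memtable argument in the thread above comes down to simple arithmetic, which the following minimal Python sketch spells out. It only restates the claim as given in his message (one memtable per column family on every node, independent of cluster size). The function names and the per-memtable size are hypothetical and chosen for illustration; the size in particular is not a Cassandra default.

```python
def memtables_per_node(tenants: int, cfs_per_tenant: int) -> int:
    """Memtables each node holds, per the claim in the thread:
    one per column family, regardless of how many nodes exist."""
    return tenants * cfs_per_tenant


def memtable_memory_mb(tenants: int, cfs_per_tenant: int,
                       mb_per_memtable: int) -> int:
    """Rough upper bound on memtable memory per node.
    mb_per_memtable is an assumed figure, not a measured one."""
    return memtables_per_node(tenants, cfs_per_tenant) * mb_per_memtable


if __name__ == "__main__":
    # The example from the thread: 1000 tenants, 1 CF each.
    print(memtables_per_node(1000, 1))
    # With a hypothetical 64 MB per memtable:
    print(memtable_memory_mb(1000, 1, 64))
```

Note that the node count does not appear anywhere in the calculation, which is the point of the argument: adding servers does not reduce the per-node memtable count.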

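The "virtual keyspaces" suggestion in the thread can be pictured with a small sketch, assuming the idea amounts to storing all tenants in one shared column family and prefixing every row key with a tenant identifier. This is a toy in-memory model in Python, not Hector's or Pycassa's API; the class names, the `:` separator, and the dict-backed store are all hypothetical.

```python
class SharedColumnFamily:
    """Toy stand-in for a single column family shared by all tenants."""

    def __init__(self):
        self.rows = {}  # row key -> {column name: value}


class VirtualKeyspace:
    """Per-tenant view that transparently prefixes row keys."""

    SEP = ":"  # assumed separator; tenant ids must not contain it

    def __init__(self, shared: SharedColumnFamily, tenant_id: str):
        if self.SEP in tenant_id:
            raise ValueError("tenant id must not contain the separator")
        self._shared = shared
        self._prefix = tenant_id + self.SEP

    def put(self, key: str, columns: dict) -> None:
        # Write under the tenant-prefixed key in the shared store.
        self._shared.rows.setdefault(self._prefix + key, {}).update(columns)

    def get(self, key: str) -> dict:
        # Return a copy so callers cannot mutate the shared store.
        return dict(self._shared.rows.get(self._prefix + key, {}))

    def keys(self) -> list:
        # Only this tenant's keys, with the prefix stripped.
        n = len(self._prefix)
        return sorted(k[n:] for k in self._shared.rows
                      if k.startswith(self._prefix))


if __name__ == "__main__":
    shared = SharedColumnFamily()
    a = VirtualKeyspace(shared, "tenant_a")
    b = VirtualKeyspace(shared, "tenant_b")
    a.put("user1", {"name": "alice"})
    b.put("user1", {"name": "bob"})
    print(a.get("user1"), b.get("user1"), len(shared.rows))
```

Under this model the number of column families (and so memtables) stays fixed as tenants are added, at the cost of isolation being enforced only by the client-side prefix.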