incubator-cassandra-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: How expensive are additional keyspaces?
Date Tue, 11 Mar 2014 16:22:40 GMT
So in the 0.6.X days a signature of a get looked something like this:

get(String keyspace, ColumnPath cp, String rowkey)

Besides the change from String -> ByteBuffer, the keyspace was pulled out of
the argument list.

I think the better, more flexible way to do this would be:

struct GetRequest {
   1: optional string keyspace,
   2: required binary rowkey,
   3: optional ColumnPath columnPath
}

get(GetRequest g)

This would put some burden on clients to build request objects instead of
calling methods directly, but I think it would make the API easier to evolve.

However it is hard for me to justify making a second copy of each method
for this small use case. Otherwise I would take that up.
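
As a rough illustration of the builder-style request described above, here
is a minimal Java sketch. The GetRequest class, its Builder, and the use of a
plain String for the row key and column path are all hypothetical stand-ins
and are not part of the Cassandra Thrift API.

```java
import java.util.Objects;
import java.util.Optional;

/**
 * Hypothetical request object mirroring the GetRequest struct above.
 * Illustration only; not part of any Cassandra or Thrift API.
 */
final class GetRequest {
    private final String keyspace;   // optional: null means "not specified"
    private final String rowKey;     // required
    private final String columnPath; // optional; String stands in for ColumnPath

    private GetRequest(Builder b) {
        this.keyspace = b.keyspace;
        this.rowKey = b.rowKey;
        this.columnPath = b.columnPath;
    }

    /** The row key is the only required field, so the builder demands it up front. */
    static Builder builder(String rowKey) {
        return new Builder(rowKey);
    }

    Optional<String> keyspace() { return Optional.ofNullable(keyspace); }
    String rowKey() { return rowKey; }
    Optional<String> columnPath() { return Optional.ofNullable(columnPath); }

    static final class Builder {
        private final String rowKey;
        private String keyspace;
        private String columnPath;

        private Builder(String rowKey) {
            this.rowKey = Objects.requireNonNull(rowKey, "rowKey is required");
        }

        Builder keyspace(String keyspace) { this.keyspace = keyspace; return this; }
        Builder columnPath(String columnPath) { this.columnPath = columnPath; return this; }
        GetRequest build() { return new GetRequest(this); }
    }
}
```

A caller would write something like
GetRequest.builder("row1").keyspace("ks1").build(). New optional fields can
then be added to the builder later without changing the signature of get().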




On Tue, Mar 11, 2014 at 12:07 PM, Peter Lin <woolfel@gmail.com> wrote:

>
> if I have time this summer, I may work on that, since I like having thrift.
>
>
> On Tue, Mar 11, 2014 at 12:05 PM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
>
>> This mistake is not a thrift limitation. In 0.6.X you could switch
>> keyspaces without calling setKeyspace(String): methods specified the
>> keyspace in every operation. This mirrors the StorageProxy class. In
>> 0.7.X setKeyspace() was created and the keyspace was removed from all these
>> thrift methods. I really dislike that change personally :)
>>
>> If someone was so motivated, they could pretty easily (a couple days
>> work) add new methods to thrift that do not have this limitation.
>>
>>
>>
>>
>> On Tue, Mar 11, 2014 at 11:39 AM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>
>>> That is correct.  Another place where the mistakes of Thrift informed
>>> our development of the native protocol.
>>>
>>> On Tue, Mar 11, 2014 at 10:08 AM, Keith Wright <kwright@nanigans.com>
>>> wrote:
>>> > Does this hold true for the native protocol?  I've noticed that you can
>>> > create a session object in the datastax driver without specifying a
>>> > keyspace and so long as you include the keyspace in all queries instead
>>> > of just the table name, it works fine.  In that case, I assume there's
>>> > only one connection pool for all keyspaces.
>>> >
>>> > From: Edward Capriolo <edlinuxguru@gmail.com>
>>> > Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>> > Date: Tuesday, March 11, 2014 at 11:05 AM
>>> > To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>> > Subject: Re: How expensive are additional keyspaces?
>>> >
>>> > The biggest expense of them is that you need to be authenticated to a
>>> > keyspace to perform an operation. Thus connection pools are bound to
>>> > keyspaces. Switching a keyspace is an RPC operation. In the thrift
>>> > client, if you have 100 keyspaces you need 100 connection pools, which
>>> > starts to be a pain very quickly.
>>> >
>>> > I suggest keeping everything in one keyspace unless you really need
>>> > different replication factors and/or network replication settings per
>>> > keyspace.
>>> >
>>> >
>>> > On Tue, Mar 11, 2014 at 10:17 AM, Martin Meyer <elreydetodo@gmail.com>
>>> > wrote:
>>> >>
>>> >> Hey all -
>>> >>
>>> >> My company is working on introducing a configuration service system to
>>> >> provide config data to several of our applications, to be backed by
>>> >> Cassandra. We're already using Cassandra for other services, and at
>>> >> the moment our pending design just puts all the new tables (9 of them,
>>> >> I believe) in one of our pre-existing keyspaces.
>>> >>
>>> >> I've got a few questions about keyspaces that I'm hoping for input on.
>>> >> Some Google hunting didn't turn up obvious answers, at least not for
>>> >> recent versions of Cassandra.
>>> >>
>>> >> 1) What trade offs are being made by using a new keyspace versus
>>> >> re-purposing an existing one (that is in active use by another
>>> >> application)? Organization is the obvious answer, I'm looking for any
>>> >> technical reasons.
>>> >>
>>> >> 2) Is there any per-keyspace overhead incurred by the cluster?
>>> >>
>>> >> 3) Does it impact on-disk layout at all for tables to be in a
>>> >> different keyspace from others? Is any sort of file fragmentation
>>> >> potentially introduced just by doing this in a new keyspace as opposed
>>> >> to an existing one?
>>> >>
>>> >> 4) Does it add any metadata overhead to the system keyspace?
>>> >>
>>> >> 5) Why might we *not* want to make a separate keyspace for this?
>>> >>
>>> >> 6) Does anyone have experience with creating additional keyspaces to
>>> >> the point that Cassandra can no longer handle it? Note that we're
>>> >> *not* planning to do this, I'm just curious.
>>> >>
>>> >> Cheers,
>>> >> Martin
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder, http://www.datastax.com
>>> @spyced
>>>
>>
>>
>
