incubator-cassandra-dev mailing list archives

From Evan Weaver <ewea...@gmail.com>
Subject Re: Alternative wire protocols
Date Tue, 30 Jun 2009 18:55:34 GMT
Since the API is so thorny right now, it would be extremely difficult
to automate filter calls. What if we had something like the following
(pardon my bogus syntax):

resultUnion_n get_one(column_family:string, key:string,
[super_column:string], [column:string], [options:Something_t])
list<resultUnion_n> get_all(column_family:string, key:string,
[super_column:string], [column:string], [options:Something_t])

where resultUnion_n is: (scalar | Column_t | SuperColumn_t)

Ideally Column_t and SuperColumn_t could be merged, but that's not a big deal.

The options struct/dict could have the composable filters of various kinds.
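
To make the shape of the proposal concrete, here is a minimal Python sketch over a toy in-memory store. Everything in it is an assumption made for illustration and not part of cassandra.thrift: the Options class with its where() and take() methods, the dict-based store, and the choice to return a scalar value when an exact column is named.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple, Union


@dataclass(frozen=True)
class Column:
    name: str
    value: str


@dataclass(frozen=True)
class SuperColumn:
    name: str
    columns: Tuple[Column, ...]


Item = Union[Column, SuperColumn]
Result = Union[str, Column, SuperColumn]  # scalar | Column_t | SuperColumn_t
Predicate = Callable[[Item], bool]


@dataclass(frozen=True)
class Options:
    """Hypothetical 'options' struct: a chain of filters plus a limit."""
    predicates: Tuple[Predicate, ...] = ()
    limit: Optional[int] = None

    def where(self, predicate: Predicate) -> "Options":
        # Return a new Options so that calls can be chained.
        return Options(self.predicates + (predicate,), self.limit)

    def take(self, n: int) -> "Options":
        return Options(self.predicates, n)


# Toy store (an assumption): column family -> row key -> items.
Store = Dict[str, Dict[str, List[Item]]]


def get_all(store: Store, column_family: str, key: str,
            super_column: Optional[str] = None,
            column: Optional[str] = None,
            options: Optional[Options] = None) -> List[Result]:
    items: List[Item] = list(store.get(column_family, {}).get(key, []))
    if super_column is not None:
        # Descend into the named super column and work on its columns.
        items = [c for sc in items
                 if isinstance(sc, SuperColumn) and sc.name == super_column
                 for c in sc.columns]
    if column is not None:
        items = [i for i in items if i.name == column]
    opts = options or Options()
    for predicate in opts.predicates:
        items = [i for i in items if predicate(i)]
    if opts.limit is not None:
        items = items[:opts.limit]
    # Assumption: naming an exact column yields its scalar value.
    return [i.value if column is not None and isinstance(i, Column) else i
            for i in items]


def get_one(store: Store, column_family: str, key: str,
            super_column: Optional[str] = None,
            column: Optional[str] = None,
            options: Optional[Options] = None) -> Optional[Result]:
    results = get_all(store, column_family, key, super_column, column, options)
    return results[0] if results else None


# Example data, made up for this sketch.
store: Store = {
    "Users": {
        "alice": [Column("age", "30"), Column("email", "a@example.com"),
                  Column("name", "Alice")],
    },
    "Feeds": {
        "alice": [SuperColumn("recent", (Column("1", "hello"),
                                         Column("2", "world")))],
    },
}

# Filters compose by chaining on a single options value.
opts = Options().where(lambda c: c.name >= "b").take(2)
filtered = get_all(store, "Users", "alice", options=opts)
```

The point of the sketch is that the API surface stays at two calls; new query behaviour is added by extending the options value, not by adding another get_by_X method.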

Is this even remotely possible?

Evan

On Sun, Jun 28, 2009 at 5:49 AM, Bill de hOra<bill@dehora.net> wrote:
> Evan Weaver wrote:
>>
>> I wanted to start a small discussion to see if there is any interest
>> in supporting alternative wire protocols or perhaps junking Thrift to
>> some degree.
>>
>> Some options:
>>  * Use JSON over HTTP
>>  * Use BSON over...something (http://www.mongodb.org/display/DOCS/BSON)
>>  * Use ASN.1 over...something
>>  * Use Protocol Buffers over...something
>>  * Use Thrift, but package Cassandra-specific clients for each language
>>
>> I have not thought too coherently about this but generic Thrift seems
>> to be a pain point for everybody.
>
> Hi Evan,
>
> I've been playing around again with Cassandra recently and I agree Thrift is
> a pain point, and that was the case when I looked at the project originally.
> But I think it's not so much Thrift as how the data is presented to clients.
>
> Much more important to me is that to use Cassandra means reading and
> understanding the service api calls in cassandra.thrift. Personally I
> wouldn't have designed a fine grained API over the generic data structures
> implied by a column store, where simple filters and selects become a litany
> of get_by_X calls. For example, 4 methods return list<column_t>, 2 return
> list<string>, 2 return list<superColumn_t>, there are 5 get_slice and 4
> get_column variants. And typical of RPC, none of this stuff composes. In
> something like Django there are chained filter() calls (Hibernate has
> similar Criteria calls) which makes for a stable programming API, where what
> you need to figure out is the criteria to pass. With Cassandra you have to do
> that and find the right method; the API surface is much bigger. Simple
> keystores and dynamo style models get away with fine grained RPC as there's
> nothing much to do except the key lookup and multiget usecases. They're not
> a design sweetspot for column store APIs imvho.
>
> I think the question for Cassandra is not so much about serialization
> techniques and speed as whether RPC is the best way to expose the data.
>
> Bill
>



-- 
Evan Weaver
