incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ran Tavory <ran...@gmail.com>
Subject Re: Tree Search in Cassandra
Date Mon, 07 Jun 2010 06:43:08 GMT
Quote from Gary:

batch_mutate makes no atomicity guarantees.  It s intended to help avoiding
many round-trips.
It can fail half-way through leaving you with a partially completed batch.

On Mon, Jun 7, 2010 at 9:39 AM, David Boxenhorn <david@lookin2.com> wrote:

> Is batch mutate atomic? If not, can we make it so?
>
> On Mon, Jun 7, 2010 at 4:11 AM, Tatu Saloranta <tsaloranta@gmail.com>wrote:
>
>> Yeah, or maybe just "clustering", since there is no branching structure.
>> It's quite commonly useful even on regular b-tree style storage (BDB
>> et al), as it can reduce per-entry overhead quite a bit. And allows
>> very efficient compression, if entries have lots of redundancy (xml or
>> json serialized data).
>>
>> I doubt this can be done reliably from client perspective. While a
>> good idea from functionality perspective, problem is that it requires
>> some level of atomic operations or locking, since updates are
>> multi-step operations. From server side I guess it would be similar to
>> work on allowing atomic multi-part operations (like ones being worked
>> on to implement counters?).
>>
>> -+ Tatu +-
>>
>> On Sun, Jun 6, 2010 at 2:19 AM, Ran Tavory <rantav@gmail.com> wrote:
>> > sounds interesting... btree on top of cassandra ;)
>> >
>> > On Sun, Jun 6, 2010 at 12:16 PM, David Boxenhorn <david@lookin2.com>
>> wrote:
>> >>
>> >> I'm still thinking about the problem of how to handle range queries on
>> >> very large sets of data, using Random Partitioning.
>> >>
>> >> Has anyone used tree search to solve this? What do you think?
>> >>
>> >> More specifically, something like this:
>> >>
>> >> - Store a maximum of 1000 values per supercolumn (or some other fixed
>> >> number)
>> >> - Each supercolumn has a "greaterChild" and a "lessChild" in addition
>> to
>> >> the values
>> >> - When the number of values in the supercolumn grows beyond the
>> maximum,
>> >> split it into 3 parts, with the top third going into "greaterChild" and
>> the
>> >> bottom third into "lessChild"
>> >> - To find a value, look at "greaterChild" and "lessChild" to find out
>> >> whether your key is within the current range, and if not, where to look
>> next
>> >> - Range searches mean finding the first value, then looking at
>> >> "greaterChild" or "lessChild" (depending on the direction of your
>> search)
>> >> until you reach the end of the range.
>> >>
>> >> Super Column Family:
>> >>
>> >> index [ <columnFamilyId> [ "firstVal" : <val> ,
>> >>                            "lastVal" : <val> ,
>> >>                            <val> : <dataId>,
>> >>                            "lessChild" : <columnFamilyId> ,
>> >>                            "greaterChild" : <columnFamilyId> ]
>> >>
>> >
>> >
>>
>
>

Mime
View raw message