incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From David Boxenhorn <da...@lookin2.com>
Subject Re: Tree Search in Cassandra
Date Mon, 07 Jun 2010 06:39:08 GMT
Is batch mutate atomic? If not, can we make it so?

On Mon, Jun 7, 2010 at 4:11 AM, Tatu Saloranta <tsaloranta@gmail.com> wrote:

> Yeah, or maybe just "clustering", since there is no branching structure.
> It's quite commonly useful even on regular b-tree style storage (BDB
> et al), as it can reduce per-entry overhead quite a bit. And allows
> very efficient compression, if entries have lots of redundancy (xml or
> json serialized data).
>
> I doubt this can be done reliably from client perspective. While a
> good idea from functionality perspective, problem is that it requires
> some level of atomic operations or locking, since updates are
> multi-step operations. From server side I guess it would be similar to
> work on allowing atomic multi-part operations (like ones being worked
> on to implement counters?).
>
> -+ Tatu +-
>
> On Sun, Jun 6, 2010 at 2:19 AM, Ran Tavory <rantav@gmail.com> wrote:
> > sounds interesting... btree on top of cassandra ;)
> >
> > On Sun, Jun 6, 2010 at 12:16 PM, David Boxenhorn <david@lookin2.com>
> wrote:
> >>
> >> I'm still thinking about the problem of how to handle range queries on
> >> very large sets of data, using Random Partitioning.
> >>
> >> Has anyone used tree search to solve this? What do you think?
> >>
> >> More specifically, something like this:
> >>
> >> - Store a maximum of 1000 values per supercolumn (or some other fixed
> >> number)
> >> - Each supercolumn has a "greaterChild" and a "lessChild" in addition to
> >> the values
> >> - When the number of values in the supercolumn grows beyond the maximum,
> >> split it into 3 parts, with the top third going into "greaterChild" and
> the
> >> bottom third into "lessChild"
> >> - To find a value, look at "greaterChild" and "lessChild" to find out
> >> whether your key is within the current range, and if not, where to look
> next
> >> - Range searches mean finding the first value, then looking at
> >> "greaterChild" or "lessChild" (depending on the direction of your
> search)
> >> until you reach the end of the range.
> >>
> >> Super Column Family:
> >>
> >> index [ <columnFamilyId> [ "firstVal" : <val> ,
> >>                            "lastVal" : <val> ,
> >>                            <val> : <dataId>,
> >>                            "lessChild" : <columnFamilyId> ,
> >>                            "greaterChild" : <columnFamilyId> ]
> >>
> >
> >
>

Mime
View raw message