cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Shaun Cutts <>
Subject Re: order of index expressions
Date Mon, 07 Feb 2011 23:59:43 GMT

Thanks for your thoughts....

> On Sun, Feb 6, 2011 at 11:03 AM, Shaun Cutts <> wrote:
>> What I think you should be doing is the following: open iterators on the matching
keys for each of the indexes; the inside loop would pick an iterator at random, and pull a
match from it. This would assure that the expected number of entries examined is a small multiple
(# of other indexes) of the index with the most "precision".
> Isn't this a bad approximation of performing an intersection by
> iterating through each index results (which, since they return in
> sorted order, can be done without horrible memory cost)?

I don't think so, unless I'm not understanding you. The point is... suppose you have 1M matches
on the key for one index, and 1 in the other (for a row among the 1M matching the first),
and you have no information about which is which. If you arbitrarily pick one to go first
at random, you loop 500K times on average.

If you do what I'm suggesting you loop 2 times on average -- a big difference :). This is
"breadth first search" but the breadth is just the number of indexes... if there are really
thousands of indexes you could bound to the X most likely, but in that case getting the order
right for optimal depth-first is quite unlikely, whereas since this is an "AND", for even
modest X you lower your chances of a "wrong" choice exponentially.

Also, is the amount of memory for a "cursor" in an index really that large compared with the
memory that you use for query specification in any case? Your current method does assume it
has random access to the query.

I do think its a flaw that something which can be done in constant time actually takes time
which could be proportional to the number of rows, just by picking the wrong index.

> I'm not opposed to doing that.  But I have no idea how to guess when
> that is better than the current inner-loop method.

What I'm proposing is always at most a constant times number of iterations worse (ok -- #
of times proportional to the number of indexes used), vs something that could be arbitrarily
much worse. But you could adapt the current heuristic to tune the probabilities, so that the
worse case will be asymptotically the same as current.... but the average case will be much

>> I know you have a new type of index in the works... but it doesn't look like "trunk"
has any modifications for "scan", and presumably the strategy I just mentioned is pretty general
(not depending on histograms, etc). Does it sound like a good idea?
> I think you're thinking of
>, and my
> understanding is that the bitmap approach basically does what you want
> only faster.
I'll have to study it. As far as I know what I'm proposing is orthogonal to histogram methods
that don't actually keep stats on the joins (which is a lot of stats to keep), and even in
this case it can be used to optimize "lower resolution" statistics to the partial experience
from sampling the current query. 

It does introduce extra overhead for small queries, of course, so if you had histograms that
guaranteed that the query was going to be very small for a particular index then it would
be best to put that first without fiddling. However, typically you have exact info for only
the common keys, not the rare ones.

-- Shaun

> -- 
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support

View raw message