incubator-cassandra-user mailing list archives

From Tyler Hobbs <ty...@datastax.com>
Subject Re: Secondary Indexes
Date Sun, 03 Apr 2011 19:51:44 GMT
I'm not familiar with some of the details, but I'll try to answer your
questions in general.  Secondary indexes are implemented as a slightly
special separate column family with the indexed value serving as the key;
most of the properties of secondary indexes follow from that.
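
As a rough mental model of that description, here is a toy in-memory sketch. It is not Cassandra's actual implementation; the class name ToyIndexedCF and its methods are hypothetical and exist only for illustration. The "index" is simply a second mapping whose key is the indexed value.

```python
from collections import defaultdict


class ToyIndexedCF:
    """Toy in-memory model only; NOT Cassandra's real implementation."""

    def __init__(self, indexed_column):
        self.indexed_column = indexed_column
        self.rows = {}                  # row key -> {column name: value}
        self.index = defaultdict(set)   # indexed value -> set of row keys

    def insert(self, key, columns):
        # Remember the previously indexed value so a stale entry can be removed.
        old = self.rows.get(key, {}).get(self.indexed_column)
        row = self.rows.setdefault(key, {})
        row.update(columns)
        new = row.get(self.indexed_column)
        if old != new:
            if old is not None:
                self.index[old].discard(key)
            if new is not None:
                self.index[new].add(key)

    def get_by_index(self, value):
        # Equality lookup only: the indexed value is the key of the index.
        return sorted(self.index.get(value, ()))
```

Because the lookup goes through a key on the indexed value, equality queries are natural in this model, while anything else would require scanning every index key.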

On Sun, Apr 3, 2011 at 2:28 PM, Drew Kutcharian <drew@venarc.com> wrote:

> Hi Everyone,
>
> I posted the following email a couple of days ago and I didn't get any
> responses. Makes me wonder, does anyone on this list know/use Secondary
> Indexes? They seem to me like a pretty big feature and it's a bit
> disappointing to not be able to find any documentation on them.
>
> The only thing I could find on the Wiki was the end of
> http://wiki.apache.org/cassandra/StorageConfiguration and that was
> pointing to the non-existing page
> http://wiki.apache.org/cassandra/SecondaryIndexes . In addition, I checked
> the JIRA CASSANDRA-749 and there's so much back and forth that I couldn't
> really figure out what the conclusion was. What gives?
>
> I think the Cassandra committers are doing a heck of a job adding all these
> cool features, but the documentation side doesn't really keep
> up. Jonathan Ellis's blog post on Secondary Indexes only scratches the
> surface of the topic, and if you consider that the whole point of using
> Cassandra is scalability, there isn't a single mention of how Secondary
> Indexes scale!!! (This same thing applies to Counters too)
>
> I'm not trying to be a complainer, but as someone new to this community, I
> hope you guys take my comments as productive criticism.
>
> Thanks,
>
> Drew
>
>
> [ORIGINAL POST]
>
> I just read Jonathan Ellis' great post on Secondary Indexes
> (http://www.datastax.com/dev/blog/whats-new-cassandra-07-secondary-indexes)
> and I was wondering where I can find a bit more info on them. I would
> like to know:
>
> 1) Are there any limitations besides the hash properties (no between
> queries)? Like size or memory, etc.?
>

No.


> 2) Are they distributed? If so, how does that work? How are they stored
> on the nodes?
>

Each node only indexes data that it holds locally.
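
To illustrate what "locally" implies, here is a toy sketch under stated assumptions: ToyNode, ToyCluster, and the crc32-based placement are all hypothetical stand-ins, not Cassandra's partitioner or query path. Each node keeps an index over its own rows only, so an index lookup in this model has to ask every node and merge the results.

```python
import zlib
from collections import defaultdict


class ToyNode:
    """One node: holds some rows plus an index over ONLY those rows."""

    def __init__(self):
        self.rows = {}                  # row key -> country
        self.index = defaultdict(set)   # country -> keys of rows on this node

    def write(self, key, country):
        old = self.rows.get(key)
        if old is not None:
            self.index[old].discard(key)
        self.rows[key] = country
        self.index[country].add(key)


class ToyCluster:
    """Hypothetical cluster; rows are placed by a hash of the row key."""

    def __init__(self, n_nodes):
        self.nodes = [ToyNode() for _ in range(n_nodes)]

    def node_for(self, key):
        return self.nodes[zlib.crc32(key.encode()) % len(self.nodes)]

    def write(self, key, country):
        # The row and its index entry land on the same node.
        self.node_for(key).write(key, country)

    def query(self, country):
        # No single node knows every match, so ask each one and merge.
        hits = []
        for node in self.nodes:
            hits.extend(node.index.get(country, ()))
        return sorted(hits)
```

In this model a write touches one node, while an index query fans out across nodes.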


> 3) When you write a new row, when/how does the index get updated? What I
> would like to know is the atomicity of the operation, is the "index write"
> part of the "row write"?
>

The row and index updates are one atomic operation.


> 4) Is there a difference between creating a secondary index vs creating an
> "index" CF manually such as "users_by_country"?
>

Yes.  First, when you create your own index CF, an index row may live on a
different node than the data it indexes.  Second, updates to the index and
to the data are not atomic.
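
A toy sketch of the hand-rolled approach makes both differences visible. Everything here is an assumption for illustration: ToyStore, add_user, and the crc32-based placement are hypothetical and are not a real client API. The CF names "users" and "users_by_country" follow the question above.

```python
import zlib


class ToyStore:
    """Hypothetical key-value cluster; each row is placed by hashing its row key."""

    def __init__(self, n_nodes):
        # One dict per node: (column family, row key) -> {column name: value}
        self.nodes = [dict() for _ in range(n_nodes)]

    def owner(self, row_key):
        return zlib.crc32(row_key.encode()) % len(self.nodes)

    def insert(self, cf, row_key, columns):
        node = self.nodes[self.owner(row_key)]
        node.setdefault((cf, row_key), {}).update(columns)

    def get(self, cf, row_key):
        return dict(self.nodes[self.owner(row_key)].get((cf, row_key), {}))


def add_user(store, user_id, country):
    # Write 1: the data row, placed according to the user id.
    store.insert("users", user_id, {"country": country})
    # If the client fails at this point, the row exists with no index entry.
    # Write 2: the index row, placed according to the country, so it may
    # land on a different node than the user row it points to.
    store.insert("users_by_country", country, {user_id: ""})
```

The two inserts are independent operations, which is why the application has to handle the case where only one of them succeeds.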

Your feedback is certainly helpful and hopefully we can get some of these
details into the documentation!

-- 
Tyler Hobbs
Software Engineer, DataStax <http://datastax.com/>
Maintainer of the pycassa <http://github.com/pycassa/pycassa> Cassandra
Python client library
