From: Stu Hood
To: user@cassandra.apache.org
Date: Wed, 9 Feb 2011 03:09:44 -0800
Subject: Re: How do secondary indices work

Alexander:

The secondary indexes in 0.7.0 (type KEYS) are stored internally in a column family, and are kept synchronized with the base data via locking on a local node, meaning they are always consistent on the local node. Eventual consistency still applies between nodes, but a returned result will always match your query.
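
To illustrate the node-local synchronization described above, here is a minimal Python sketch. It is a toy model, not Cassandra's code: the class `LocalNode`, its methods, and the use of a single `threading.Lock` are all assumptions made for illustration. It only shows the idea that the base row and its index entry change together under one local lock.

```python
import threading
from collections import defaultdict


class LocalNode:
    """Toy model of one node: base rows plus a KEYS-style index on one column.

    Hypothetical illustration only; none of these names come from Cassandra.
    """

    def __init__(self, indexed_column):
        self.indexed_column = indexed_column
        self.rows = {}                 # row key -> {column name: value}
        self.index = defaultdict(set)  # indexed value -> set of row keys
        self._lock = threading.Lock()  # stands in for the node-local locking

    def write(self, row_key, columns):
        # Update base data and index under one lock, so a local reader never
        # sees an index entry that disagrees with the base row.
        with self._lock:
            row = self.rows.setdefault(row_key, {})
            old = row.get(self.indexed_column)
            row.update(columns)
            new = row.get(self.indexed_column)
            if old != new:
                if old is not None:
                    self.index[old].discard(row_key)
                if new is not None:
                    self.index[new].add(row_key)

    def keys_for(self, value):
        # Read the index under the same lock used by writers.
        with self._lock:
            return sorted(self.index.get(value, ()))


node = LocalNode("state")
node.write("alice", {"state": "UT"})
node.write("bob", {"state": "UT"})
node.write("alice", {"state": "TX"})  # moves alice from the UT entry to TX
print(node.keys_for("UT"), node.keys_for("TX"))
```

Because the old index entry is removed in the same critical section that writes the new value, a local lookup for "UT" no longer returns "alice" once the update is visible.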

This index column family stores a mapping from index values to a sorted list of matching row keys. When you query for rows between x and y matching a value z (via the get_indexed_slices call), Cassandra performs a lookup to the index column family for the slice of columns in row z between x and y. If any matches are found in the index, they are row keys that match the index clause, and we query the base data to return you those rows.
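
The lookup described above can be sketched in a few lines of Python. This is a toy in-memory model under stated assumptions, not the server implementation: `IndexColumnFamily`, `get_indexed_rows`, the inclusive bounds, and the plain string ordering of row keys are all hypothetical choices made for the example.

```python
import bisect


class IndexColumnFamily:
    """Toy index: one index row per indexed value, holding sorted row keys."""

    def __init__(self):
        self._rows = {}  # indexed value -> sorted list of base row keys

    def add(self, value, row_key):
        # Keep each index row sorted and free of duplicates.
        keys = self._rows.setdefault(value, [])
        pos = bisect.bisect_left(keys, row_key)
        if pos == len(keys) or keys[pos] != row_key:
            keys.insert(pos, row_key)

    def slice(self, value, start, finish):
        """Return row keys in index row `value` with start <= key <= finish.

        Inclusive bounds are an assumption of this sketch.
        """
        keys = self._rows.get(value, [])
        lo = bisect.bisect_left(keys, start)
        hi = bisect.bisect_right(keys, finish)
        return keys[lo:hi]


def get_indexed_rows(base, index, value, start, finish):
    """Find matching keys in the index, then fetch those rows from base data."""
    return {key: base[key] for key in index.slice(value, start, finish)}


base = {
    "k1": {"state": "UT"},
    "k2": {"state": "TX"},
    "k3": {"state": "UT"},
    "k7": {"state": "UT"},
}
index = IndexColumnFamily()
for key, columns in base.items():
    index.add(columns["state"], key)

# Rows between x="k2" and y="k5" whose indexed value is z="UT".
print(get_indexed_rows(base, index, "UT", "k2", "k5"))
```

Here the index row for "UT" plays the role of "row z", and the slice from "k2" to "k5" is the range between x and y; only the keys found there are read from the base data.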

Iterating through all of the rows matching an index clause on your cluster is guaranteed to touch N/RF of the nodes in your cluster, because each node only knows about data that is indexed locally.
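
As a quick worked example of the N/RF figure above, reading N as the number of nodes and RF as the replication factor (that reading, the function name, and rounding up when N is not a multiple of RF are assumptions of this sketch):

```python
import math


def nodes_touched_by_full_index_scan(num_nodes, replication_factor):
    """Estimate N/RF: nodes contacted to see every locally indexed row once.

    Hypothetical helper; rounding up for non-divisible cases is an assumption.
    """
    if num_nodes < 1 or not 1 <= replication_factor <= num_nodes:
        raise ValueError("need num_nodes >= 1 and 1 <= replication_factor <= num_nodes")
    return math.ceil(num_nodes / replication_factor)


print(nodes_touched_by_full_index_scan(12, 3))  # 12 nodes, RF 3
print(nodes_touched_by_full_index_scan(6, 1))   # no replication: every node
```

With 12 nodes and RF 3 the estimate is 4 nodes; with RF 1 every node holds a distinct slice of the index, so all of them are contacted.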

Some portions of the indexing implementation are not fully baked yet: for instance, although the API allows you to specify multiple columns, only one index will actually be used per query, and the rest of the clauses will be brute forced.
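
A rough Python sketch of what "one index, the rest brute forced" means in practice. The function `indexed_query`, its data layout, and the rule of using the first clause that has an index are assumptions for illustration; how the server actually chooses which index to use is not stated here.

```python
def indexed_query(rows, indexes, clauses):
    """Return the row keys satisfying every (column, value) equality clause.

    rows:    row key -> {column: value}
    indexes: column -> {value: list of row keys}, one entry per indexed column
    Hypothetical model: exactly one index drives the query.
    """
    # Pick the first clause on an indexed column (an assumption of this sketch).
    driver = next((c for c in clauses if c[0] in indexes), None)
    if driver is None:
        raise ValueError("at least one clause must be on an indexed column")
    column, value = driver
    candidates = indexes[column].get(value, [])

    # Every other clause is checked row by row against the base data.
    remaining = [c for c in clauses if c is not driver]
    return [
        key for key in candidates
        if all(rows[key].get(col) == val for col, val in remaining)
    ]


rows = {
    "a": {"state": "UT", "age": 30},
    "b": {"state": "UT", "age": 41},
    "c": {"state": "TX", "age": 30},
}
indexes = {"state": {"UT": ["a", "b"], "TX": ["c"]}}
print(indexed_query(rows, indexes, [("state", "UT"), ("age", 30)]))
```

The cost of the query is therefore driven by how many rows the chosen index returns, since each of those candidates must be read and filtered for the remaining clauses.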

A second secondary index implementation has been on the back burner for a while: it provides an identical API, but does not use a column family to store the index, and should be more efficient for append only data. See https://issues.apache.org/jira/browse/CASSANDRA-1472

Thanks,
Stu

On Wed, Feb 9, 2011 at 2:35 AM, <altanis@ceid.upatras.gr> wrote:
> Thank you for the links, I did read a bit in the comments of the ticket,
> but I couldn't get much out of it.
>
> I am mainly interested in how the index is stored and partitioned, not how
> it is used. I think the people in the dev list will probably be better
> qualified to answer that. My questions always seem to get moved to the
> user list, and usually with good cause, but I think this time it should be
> in the dev list :) Please move it back, if you can.
>
> Alexander
>
> > AFAIK this was the ticket the original work was done under
> > https://issues.apache.org/jira/browse/CASSANDRA-1415
> >
> > also http://www.datastax.com/docs/0.7/data_model/secondary_indexes
> > and http://pycassa.githubcom/pycassa/tutorial.html#indexes may help
> >
> > (sorry on reflection the email prob did not need to be moved from dev, my
> > bad)
> > Aaron
> >
> > On 09 Feb, 2011,at 09:16 AM, Aaron Morton <aaron@thelastpickle.com> wrote:
> >
> > Moving to the user group.
> >
> > On 08 Feb, 2011,at 11:39 PM, altanis@ceid.upatras.gr wrote:
> >
> > Hello,
> >
> > I'd like some information about how secondary indices work under the hood.
> >
> > 1) Is data stored in some external data structure, or is it stored in an
> > actual Cassandra table, as columns within column families?
> > 2) Is data stored sorted or not? How is it partitioned?
> > 3) How can I access index data?
> >
> > Thanks in advance,
> >
> > Alexander Altanis
