From: aaron morton
To: user@cassandra.apache.org
Subject: Re: Indexes on heterogeneous rows
Date: Thu, 14 Apr 2011 21:12:00 +1200
Message-Id: <8B700116-AFCF-490E-9C6F-19F0A2F76D21@thelastpickle.com>

Need to clear up some terminology here.

Rows have a key and can be retrieved by key. This is *sort of* the primary index, but not primary in the normal RDBMS sense.
Rows can have different columns, and within a row the column names are sorted, so columns can be selected efficiently.
There are "secondary indexes" in Cassandra 0.7 based on column values: http://www.datastax.com/dev/blog/whats-new-cassandra-07-secondary-indexes

So you could create secondary indexes on the a, e, and h columns and get rows that have specific values. There are some limitations to secondary indexes; read the linked article.
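
To make the idea concrete, here is a toy in-memory model of a value-based index over rows with different column sets. It is only a sketch: it is not Cassandra's API or its implementation, and the row keys, column values, and the `build_value_index` / `lookup` helpers are all made up for illustration. The point is just that a row without the indexed column never shows up in that index, so mixed row types do not get in each other's way.

```python
from collections import defaultdict

# Hypothetical data: three row "types", each with its own set of columns.
rows = {
    "row1": {"type": "1", "a": "x", "b": "1", "c": "2"},
    "row2": {"type": "2", "d": "3", "e": "y", "f": "4"},
    "row3": {"type": "3", "g": "5", "h": "z", "i": "6"},
    "row4": {"type": "1", "a": "x", "b": "7", "c": "8"},
}


def build_value_index(rows, column):
    """Map each value of `column` to the set of row keys holding that value.

    Rows that do not have the column are skipped.
    """
    index = defaultdict(set)
    for key, columns in rows.items():
        if column in columns:
            index[columns[column]].add(key)
    return index


# One index per indexed column, as in the a / e / h example.
indexes = {col: build_value_index(rows, col) for col in ("a", "e", "h")}


def lookup(column, value):
    """Return the sorted row keys whose `column` equals `value`."""
    return sorted(indexes[column].get(value, set()))


print(lookup("a", "x"))
print(lookup("e", "y"))
```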

Or you can make your own secondary indexes using row keys as the index values.
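
A rough sketch of what a hand-built index could look like, again as a plain in-memory Python model rather than real client code. The assumption here is a second "index" column family whose row key is the indexed value (prefixed with the column name) and whose sorted entries are the keys of the data rows. The names `data_cf`, `index_cf`, `insert` and `find` are hypothetical.

```python
import bisect

data_cf = {}   # data row key -> {column name: value}
index_cf = {}  # index row key, e.g. "a:x" -> sorted list of data row keys


def index_key(column, value):
    """Build the index row key for a (column, value) pair."""
    return f"{column}:{value}"


def insert(row_key, columns, indexed=("a", "e", "h")):
    """Write a data row and add it to the index row of each indexed column."""
    data_cf[row_key] = dict(columns)
    for col in indexed:
        if col in columns:
            entries = index_cf.setdefault(index_key(col, columns[col]), [])
            pos = bisect.bisect_left(entries, row_key)
            # Keep entries sorted and unique, like column names within a row.
            if pos == len(entries) or entries[pos] != row_key:
                entries.insert(pos, row_key)


def find(column, value, start="", limit=100):
    """Return up to `limit` matching (row key, columns) pairs after `start`."""
    entries = index_cf.get(index_key(column, value), [])
    pos = bisect.bisect_right(entries, start) if start else 0
    return [(k, data_cf[k]) for k in entries[pos:pos + limit]]


insert("row1", {"type": "1", "a": "x", "b": "1"})
insert("row2", {"type": "2", "e": "y", "f": "4"})
insert("row4", {"type": "1", "a": "x", "b": "7"})

first_page = find("a", "x", limit=1)
next_page = find("a", "x", start=first_page[-1][0], limit=1)
print(first_page, next_page)
```

Reading in pages with `start` and `limit` keeps each read small no matter how many rows share a value. The sketch does not remove the old index entry when an indexed column is overwritten with a new value, so a real version would need to handle that.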

If you have billions of rows, how many do you need to read back at once?

Hope that helps
Aaron

On 14 Apr 2011, at 04:23, David Boxenhorn wrote:

> Is it possible in 0.7.x to have indexes on heterogeneous rows, which have different sets of columns?
>
> For example, let's say you have three types of objects (1, 2, 3) which each had three members. If your rows had the following pattern
>
> type=1 a=? b=? c=?
> type=2 d=? e=? f=?
> type=3 g=? h=? i=?
>
> could you index "type" as your primary index, and also index "a", "e", "h" as secondary indexes, to get the objects of that type that you are looking for?
>
> Would it work if you had billions of rows of each type?
