From: Bob Hutchison
To: user@cassandra.apache.org
Cc: Bob Hutchison
Subject: How do you construct an index and use it, especially in Ruby
Date: Sun, 25 Apr 2010 14:14:39 -0400

Hi,

I'm new to Cassandra and trying to work out how to do something that I've implemented any number of times (e.g.
TokyoCabinet, Perst, even the filesystem using grep :-). I've managed to get some of this working in Cassandra, but not all.

So here's the core of the situation.

I have this opaque chunk of data that I want to store in Cassandra and then find again.

I can generate a key very easily when the data is created, and I've stored it in a straightforward manner: in a column with a key whose value is the data. And I can retrieve it when I know the key. No difficulties here at all, works fine.

Now I want to index this data, taking what I imagine to be a pretty typical approach.

Let's say there are two many-to-one indexes: 'colour' and 'size'. Each colour value will have more than one chunk of data, same for size.

What I thought I'd do is make a super column and index the chunk of data kind of like: { 'colour' => { 'blue' => 1 }, 'size' => { 'large' => 1 } } with the key equal to the key of the chunk of data. And Cassandra stores it without error like that. So using the Ruby gem, it'd be something along the lines of:

cassandra.insert(:Indexes, key_of_the_chunk_of_data, { 'colour' => { 'blue' => 1 }, 'size' => { 'large' => 1 } })

Q1: Is this a reasonable approach? It *seems* to be what I've read is supposed to be done. The 1 is meaningless. Anyway, it executes without error in Ruby.

Q2: What is the syntax of the (Ruby) query to find the keys of all 'blue' chunks of data? I'm assuming get_range is the correct method, but what are the parameters? The docs say: get_range(column_family, options={}) but that seems to be missing a bit of detail, in particular the super column name.

Q2a: So I know there's a :start and a :finish key supported in the options hash, inclusive and exclusive respectively. How do you define a range for equals with a UTF8 key? Surely not 'blue'.succ?? Or by some kind of suffix??

Q2b: How do you specify the super column name 'colour'?
Looking at the (Ruby) source of the get_range method, I'm unconvinced that this is implemented (there seems to be a constant '' used where the super column name would make sense).

Anyway, I ended up hacking at the Ruby gem's source to use the column name where the '' was in the original, and didn't really get anywhere useful (I can find nothing, or everything, nothing in between).

Q3: If I am correct about what is supposed to be done, does the Ruby gem support it?

Q4: Does anyone know of some Ruby code that does an indexed lookup that they could point me at? (Lots of code that indexes, but nothing that searches by the index.)

I'll try to take a look at some of the other Cassandra client implementations and see if I can get this model to work. Maybe just a Ruby problem?? With any luck, it'll be me messing up.

If it'd help I can post the source of what I have, but it'll need some cleanup. Let me know.

Thanks for taking the time to read this far :-)

Bob

----
Bob Hutchison
Recursive Design Inc.
http://www.recursive.ca/
weblog: http://xampl.com/so
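
P.S. To make the model above concrete, here's a minimal sketch in plain Ruby. It uses nested Hashes as a stand-in for the :Indexes column family and makes no Cassandra calls at all, so it says nothing about what the gem's get_range actually supports. The names INDEXES, NAMES, keys_where and half_open, and the 'chunk-N' keys, are made up for illustration. The second half illustrates Q2a under two assumptions: the range is half-open as described above (:start inclusive, :finish exclusive), and names compare byte-wise, which is how Ruby compares Strings. Whether a real Cassandra range orders names the same way depends on how the cluster is configured.

```ruby
# In-memory stand-in for the :Indexes column family described above.
# No Cassandra calls are made; INDEXES, NAMES, keys_where and half_open
# are hypothetical names used only for this illustration.

# Layout: row key (key of the chunk) => super column => column name => value.
INDEXES = {
  'chunk-1' => { 'colour' => { 'blue' => 1 }, 'size' => { 'large' => 1 } },
  'chunk-2' => { 'colour' => { 'red'  => 1 }, 'size' => { 'large' => 1 } },
  'chunk-3' => { 'colour' => { 'blue' => 1 }, 'size' => { 'small' => 1 } }
}.freeze

# Return the row keys whose super column `index_name` holds a column named
# `value`. With rows keyed by chunk, every row has to be visited.
def keys_where(rows, index_name, value)
  rows.select { |_key, supers| supers.fetch(index_name, {}).key?(value) }.keys
end

# Select the names that fall in the half-open range [start, finish),
# using Ruby's byte-wise String comparison.
def half_open(names, start, finish)
  names.sort.select { |name| name >= start && name < finish }
end

NAMES = ['red', 'bluebird', 'blue', 'bluf', 'blue-green'].freeze

p keys_where(INDEXES, 'colour', 'blue')  # ["chunk-1", "chunk-3"]

# 'blue'.succ is "bluf", so this range matches everything with the prefix
# 'blue', not just the exact name.
p half_open(NAMES, 'blue', 'blue'.succ)  # ["blue", "blue-green", "bluebird"]

# Appending a NUL byte gives the smallest string that sorts after 'blue',
# so this range matches the exact name only.
p half_open(NAMES, 'blue', "blue\0")     # ["blue"]
```

In this model the 'blue' lookup is a scan over every row, because the rows are keyed by chunk rather than by colour, and 'blue'.succ as an exclusive upper bound behaves like a prefix match rather than an equality test.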