Mailing-List: contact dev-help@cassandra.apache.org; run by ezmlm
Subject: Re: Secondary indexing and 0.6/0.7 integration with Datanucleus
From: Todd Nine
Reply-To: todd@spidertracks.co.nz
To: dev@cassandra.apache.org
References: <1276646204.24794.20.camel@greenlantern.local> <1276659312.24794.22.camel@greenlantern.local>
Content-Type: multipart/alternative; boundary="=-BnqzZofd3dgl0ZBYAqVr"
Date: Wed, 16 Jun 2010 16:57:53 +1200
Message-ID: <1276664273.24794.40.camel@greenlantern.local>
Mime-Version: 1.0

--=-BnqzZofd3dgl0ZBYAqVr
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit

No problem, I didn't want to implement my own solution if an existing
one could easily be applied. Since I'll be creating CFs that represent
secondary indexes, I'll need to perform range scans over the keys of
those secondary index CFs. The column names within the CFs are the row
keys of the primary table. Is there a way I can get the intersection of
all of the column names from multiple range scans over different column
families in one result set? Otherwise I'll need to make multiple trips
and create the intersection myself in my plugin.

Here is an example of what I'm trying to do.

CF: Person

key1: {
    firstName: John
    lastName: Smith
    email: smiths@foo.com
}

key2: {
    firstName: Jane
    lastName: Smith
    email: smiths@foo.com
}

key3: {
    firstName: Jane
    lastName: Doe
    email: smiths@foo.com
}

My secondary index tables would be the following:

CF: Person_LastName

Smith: {
    key1: 0x00
    key2: 0x00
}

Doe: {
    key3: 0x00
}

CF: Person_Email

smiths@foo.com: {
    key1: 0x00
    key2: 0x00
    key3: 0x00
}

If my input is something similar to lastName == 'Smith' && email ==
"smiths@foo.com", I would read all columns from key "Smith" in CF
Person_LastName, and all columns from key "smiths@foo.com" in CF
Person_Email. The intersection of the two sets is key1 and key2, and I
would like Cassandra to return only those rows.

Thanks,
Todd

On Tue, 2010-06-15 at 23:38 -0500, Jonathan Ellis wrote:

> No chance that 749 can be backported to 0.6, sorry.
>
> On Tue, Jun 15, 2010 at 10:35 PM, Todd Nine wrote:
>
> > Let's try that again.....
> >
> > This is the intended issue.
> >
> > https://issues.apache.org/jira/browse/CASSANDRA-749
> >
> > thanks,
> > Todd
> >
> > On Tue, 2010-06-15 at 20:02 -0500, Jonathan Ellis wrote:
> >
> > What issue were you trying to link? :)
> >
> > On Tue, Jun 15, 2010 at 6:56 PM, Todd Nine wrote:
> > > Hi all,
> > > I'm implementing a Datanucleus plugin for Cassandra. I'm finished
> > > with the basic functionality, and everything seems to work pretty well.
> > > Now my issue is performing secondary indexing on fields within my data.
> > > I have outlined some of the issues I'm facing in this post.
> > >
> > > http://www.datanucleus.org/servlet/forum/viewthread_thread,6087_lastpage,yes#32610
> > >
> > > Essentially, for each operand the user specifies, I will need to make a
> > > trip to Cassandra, load the key columns, then perform an intersection
> > > with the result from my previous read. Eventually at the end of all the
> > > intersections, I will have a list of keys I will then load. This
> > > obviously requires several trips to Cassandra, where from my
> > > understanding of secondary indexing, I would only need to make one trip
> > > for multiple operands over a column family. I've read over this
> > > issue.
> > >
> > > http://issues.apache.org/jira/browse/CASSANDRA-32610
> > >
> > > And it seems to solve a lot of my woes. Is it possible/recommended to
> > > patch the current code base of 0.6.2 to perform this functionality?
> > >
> > > Thanks,
> > > Todd

--=-BnqzZofd3dgl0ZBYAqVr--
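
For reference, the client-side fallback described in the message (read one row from each index CF, then intersect the column names in the plugin) could look roughly like the sketch below. It is only an illustration under stated assumptions, not the Datanucleus plugin's code: it makes no Cassandra calls, and the class name IndexIntersection, the intersect method, and the in-memory sets standing in for the Person_LastName and Person_Email rows are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: intersects the column names (primary row keys)
// read from several secondary index rows.
public class IndexIntersection {

    static Set<String> intersect(List<Set<String>> indexRows) {
        if (indexRows.isEmpty()) {
            return new LinkedHashSet<>();
        }
        // Start from the smallest set so every retainAll touches as few keys as possible.
        List<Set<String>> sorted = new ArrayList<>(indexRows);
        sorted.sort(Comparator.comparingInt((Set<String> s) -> s.size()));

        Set<String> result = new LinkedHashSet<>(sorted.get(0));
        for (Set<String> keys : sorted.subList(1, sorted.size())) {
            result.retainAll(keys);
            if (result.isEmpty()) {
                break; // no key can match every operand
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-ins for the columns of row "Smith" in Person_LastName
        // and row "smiths@foo.com" in Person_Email (one read per operand).
        Set<String> lastNameSmith = new LinkedHashSet<>(Arrays.asList("key1", "key2"));
        Set<String> emailSmiths = new LinkedHashSet<>(Arrays.asList("key1", "key2", "key3"));

        Set<String> matches = intersect(Arrays.asList(lastNameSmith, emailSmiths));
        System.out.println(matches); // prints [key1, key2]
    }
}
```

The surviving keys are the rows that would then be loaded from the Person CF. This still costs one read per operand plus the final load, which is the round-trip overhead the message is asking how to avoid.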