incubator-cassandra-dev mailing list archives

From Todd Nine <t...@spidertracks.co.nz>
Subject Re: Secondary indexing and 0.6/0.7 integration with Datanucleus
Date Wed, 16 Jun 2010 04:57:53 GMT
No problem,
  I didn't want to implement my own solution if an existing one could
easily be applied.  Since I'll be creating CFs that represent secondary
indexes, I'll need to perform range scans over the keys of those
secondary index CFs.  The column names within those CFs are the row keys
of the primary table.  Is there a way I can get the intersection of all
of the column names from multiple range scans over different column
families in one result set?  Otherwise I'll need to make multiple trips
and compute the intersection myself in my plugin.  Here is an example of
what I'm trying to do.

CF: Person

key1: {
   firstName: John
   lastName: Smith
   email: smiths@foo.com
}

key2: {
  firstName: Jane
  lastName: Smith
  email: smiths@foo.com
}

key3: {
  firstName: Jane
  lastName: Doe
  email: smiths@foo.com
}


My secondary index tables would be the following:

CF: Person_LastName

Smith:{
  key1: 0x00
  key2: 0x00
}

Doe: {
  key3: 0x00
}

CF: Person_Email

smiths@foo.com: {
  key1: 0x00
  key2: 0x00
  key3: 0x00
}
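
As a rough illustration of how index CFs shaped like these relate to the
Person rows, here is a minimal sketch that uses plain Python dicts as an
in-memory stand-in.  No Cassandra client is involved, and the helper
name build_index is made up for this example.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the Person CF (not a Cassandra call).
person = {
    "key1": {"firstName": "John", "lastName": "Smith", "email": "smiths@foo.com"},
    "key2": {"firstName": "Jane", "lastName": "Smith", "email": "smiths@foo.com"},
    "key3": {"firstName": "Jane", "lastName": "Doe", "email": "smiths@foo.com"},
}


def build_index(rows, field):
    """Map each value of `field` to {primary row key: placeholder byte}."""
    index = defaultdict(dict)
    for row_key, columns in rows.items():
        if field in columns:
            # The column name is the primary row key; the value is a dummy byte.
            index[columns[field]][row_key] = b"\x00"
    return dict(index)


person_last_name = build_index(person, "lastName")
person_email = build_index(person, "email")
```

In a real plugin the index rows would have to be written alongside every
insert, update, and delete on the primary CF; the sketch only shows the
resulting layout.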

If my input is something like lastName == 'Smith' && email ==
'smiths@foo.com', I would read all columns from key "Smith" in CF
Person_LastName and all columns from key "smiths@foo.com" in CF
Person_Email.  The intersection of the two sets is key1 and key2, and I
would like Cassandra to return only those rows.
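
The client-side fallback (one read per index CF, then an intersection in
the plugin) could look roughly like the sketch below.  It assumes the two
index rows have already been fetched into plain Python dicts; the
function name matching_keys is hypothetical, and no Cassandra API calls
are shown.

```python
# Hypothetical in-memory copies of the index CFs from the example above.
person_last_name = {
    "Smith": {"key1": b"\x00", "key2": b"\x00"},
    "Doe": {"key3": b"\x00"},
}
person_email = {
    "smiths@foo.com": {"key1": b"\x00", "key2": b"\x00", "key3": b"\x00"},
}


def matching_keys(*lookups):
    """Intersect the column names found by each (index CF, row key) lookup."""
    key_sets = [set(index.get(row_key, {})) for index, row_key in lookups]
    # With no lookups there is nothing to intersect, so return an empty set.
    return set.intersection(*key_sets) if key_sets else set()


result = matching_keys(
    (person_last_name, "Smith"),
    (person_email, "smiths@foo.com"),
)
print(sorted(result))  # prints ['key1', 'key2']
```

Each lookup here corresponds to one round trip to an index CF, and a
further read is still needed to load the Person rows for the surviving
keys.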

Thanks,
Todd





On Tue, 2010-06-15 at 23:38 -0500, Jonathan Ellis wrote:

> No chance that 749 can be backported to 0.6, sorry.
> 
> On Tue, Jun 15, 2010 at 10:35 PM, Todd Nine <todd@spidertracks.co.nz> wrote:
> 
> >  Lets try that again.....
> >
> > This is the intended issue.
> >
> > https://issues.apache.org/jira/browse/CASSANDRA-749
> >
> > thanks,
> > Todd
> >
> >
> >
> >   On Tue, 2010-06-15 at 20:02 -0500, Jonathan Ellis wrote:
> >
> > What issue were you trying to link? :)
> >
> > On Tue, Jun 15, 2010 at 6:56 PM, Todd Nine <todd@spidertracks.co.nz> wrote:
> > > Hi all,
> > >  I'm implementing a Datanucleus plugin for Cassandra.  I'm finished
> > > with the basic functionality, and everything seems to work pretty well.
> > > Now my issue is performing secondary indexing on fields within my data.
> > > I have outlined some of the issues I'm facing in this post.
> > >
> > > http://www.datanucleus.org/servlet/forum/viewthread_thread,6087_lastpage,yes#32610
> > >
> > > Essentially, for each operand the user specifies, I will need to make a
> > > trip to Cassandra, load the key columns, then perform an intersection
> > > with the result from my previous read.  Eventually at the end of all the
> > > intersections, I will have a list of keys I will then load.  This
> > > obviously requires several trips to Cassandra, where from my
> > > understanding of secondary indexing, I would only need to make one trip
> > > for multiple operands over a column family.    I've read over this
> > > issue.
> > >
> > > http://issues.apache.org/jira/browse/CASSANDRA-32610
> > >
> > > And it seems to solve a lot of my woes.  Is it possible/recommended to
> > > patch the current code base of 0.6.2 to perform this functionality?
> > >
> > > Thanks,
> > > Todd
> > >
> > >
> >
> >
> >
> >
> >
> 
> 
