incubator-cassandra-dev mailing list archives

From Todd Nine <t...@spidertracks.com>
Subject Secondary indexing status
Date Sat, 09 Oct 2010 06:02:16 GMT
Hi all,
  I did a bit of digging around in JIRA today, and I have a few questions.
 I'm about to update my Datanucleus Cassandra plugin.

http://github.com/tnine/Datanucleus-Cassandra-Plugin

I've built my own secondary indexing scheme, which is essentially a
simplified port of the Lucandra storage format.  Paging does not work
well, and unions/intersections are memory intensive and slow because every
intermediate result has to travel over I/O to the client.  I'd prefer to
have Cassandra do most of the searching, then simply return the result set
to my plugin.  For this, I need the following functionality:


Boolean Ops:  && ||

Comparison Ops: < <= > >=

Max Size

Paging: e.g. rows 20 to 40 of the result set.  To keep paging consistent,
the user would have to supply the time of the initial query with each page
request, so that every page is taken as a subset of the same result set.
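
To make the paging idea concrete, here is a minimal sketch in Python (chosen
only for brevity).  Everything in it is hypothetical: IndexedRow, written_at
and page() are illustrative names, not part of Cassandra or its Thrift API,
and it assumes each index entry carries a write timestamp.  It also ignores
deletes.

```python
from typing import List, NamedTuple


class IndexedRow(NamedTuple):
    key: str
    written_at: int  # hypothetical write timestamp of the index entry


def page(rows: List[IndexedRow], query_time: int, start: int, end: int) -> List[str]:
    """Return keys [start, end) of the result set as it looked at query_time."""
    # Drop anything written after the initial query, so that every page
    # is cut from the same snapshot as the first one.
    snapshot = sorted(r.key for r in rows if r.written_at <= query_time)
    return snapshot[start:end]


rows = [IndexedRow("a", 10), IndexedRow("b", 20), IndexedRow("c", 30)]
first = page(rows, query_time=20, start=0, end=1)

# A row inserted between two page requests must not shift later pages.
rows.append(IndexedRow("aa", 25))
second = page(rows, query_time=20, start=1, end=2)
```

Without the query_time filter, the second page would pick up the newly
written "aa" row instead of continuing from where the first page stopped.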


Of these, the 0.7 beta 2 API lacks the || operand and paging, so those are
the pieces I need to implement.

I'd rather contribute this to the Cassandra project and then use it within
Datanucleus than build a client-side plugin.  I've done a bit of digging on
JIRA, but could not find any existing issues that relate to adding operands
or paging.  Are there any existing issues I can begin taking a look at?  I
have all next week at work dedicated to migrating to 0.7, so I'd like to use
this time to try to contribute this functionality to Cassandra.


The most significant client-facing change would be to the Thrift API.  I
would need to add the ability for clients to pass trees of boolean
operations.  Any guidance/help about how I should approach this issue would
be greatly appreciated.
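
As a rough illustration of what a tree of boolean operations could look
like, here is a small Python sketch.  The Compare, And and Or node types and
the matches() function are assumptions made up for this example; they are
not existing Thrift structures, and a real change would define the
equivalent types in the Thrift IDL and evaluate them on the server.

```python
import operator
from dataclasses import dataclass
from typing import Dict, Union

# Comparison operators from the list above, mapped to Python functions.
OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}


@dataclass
class Compare:  # leaf node: column <op> value
    column: str
    op: str
    value: int


@dataclass
class And:  # inner node: left && right
    left: "Node"
    right: "Node"


@dataclass
class Or:  # inner node: left || right
    left: "Node"
    right: "Node"


Node = Union[Compare, And, Or]


def matches(node: Node, row: Dict[str, int]) -> bool:
    """Evaluate the expression tree against one row (column name -> value)."""
    if isinstance(node, Compare):
        # A row without the column never satisfies the comparison.
        return node.column in row and OPS[node.op](row[node.column], node.value)
    if isinstance(node, And):
        return matches(node.left, row) and matches(node.right, row)
    if isinstance(node, Or):
        return matches(node.left, row) or matches(node.right, row)
    raise TypeError(f"unknown node type: {type(node).__name__}")


# (age >= 18 && age < 30) || score > 90
tree = Or(
    And(Compare("age", ">=", 18), Compare("age", "<", 30)),
    Compare("score", ">", 90),
)
```

With such a tree the client sends one request and receives only the rows
for which matches() is true, instead of fetching each branch and computing
the union or intersection itself.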


Thanks,
Todd
