incubator-cassandra-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Pavel Yaskevich <pove...@gmail.com>
Subject Re: Cassandra in memory key index
Date Fri, 08 Jun 2012 19:50:21 GMT
I would vote,  if possible, to compare it with y-fast trie [1] (it doesn't seem to be available
java implementation unfortunately) by means of memory efficiency and lookup performance. As
we use big integer tokens the main benefit from that trie could be O(log log M) predecessor
lookup and compact in-memory size. 

[1] https://en.wikipedia.org/wiki/Y-fast_trie

Best Regards
-- 
Pavel Yaskevich


On Friday 8 June 2012 at 22:19, Jason Rutherglen wrote:

> Ok looks like the IndexSummary encapsulates everything, I can start with
> hacking that.
> 
> On Fri, Jun 8, 2012 at 11:50 AM, Jason Rutherglen <
> jason.rutherglen@gmail.com (mailto:jason.rutherglen@gmail.com)> wrote:
> 
> > The Cassandra integration is probably beyond the time I have available.
> > If the locations in the code that need to be rewritten to use the FST are
> > known, and a patch simply 'plugs-in' the FST, that would be much easier.
> > Eg, I don't know how Cassandra stores the current key index for example...
> > 
> > Basically I can write FST serializing, deserializing, and key lookup code
> > fairly easy by basing it on Lucene's terms dict.
> > 
> > 
> > On Fri, Jun 8, 2012 at 6:00 AM, Radim Kolar <hsn@filez.com (mailto:hsn@filez.com)>
wrote:
> > 
> > > 
> > > If you are interested I can help, I used the FST on a Hadoop project
> > > > to implement a fast map side range join.
> > > 
> > > create JIRA item with patch attached, i will test it.
> > > 
> > 
> > 
> 
> 
> 



Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message