lucene-dev mailing list archives

From karl wettin <>
Subject Last attempt
Date Thu, 26 Jul 2007 19:56:27 GMT
Some time ago I tried to introduce LUCENE-581, a new consumer top
layer, the core changes required by LUCENE-550, my InstantiatedIndex.
I would still like to see this as part of the core. It is completely
backwards compatible but contains a few small changes that seem to
be controversial, and I'm honestly not sure why:

* Complete definalization of Term, Document and IndexReader.
* IndexWriterInterface

In my eyes, the only thing the current final classes and the lack of
such an interface do is limit Lucene development to the file-centric
Directory store. There is nothing wrong with Directory, I just want to
be able to use the same code for any store design of my choice. I want
uniform index handling, no matter the implementation. One line of code
that switches between Directory, BDB, MemoryIndex, InstantiatedIndex
or what not.
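
As a rough illustration of the "one line of code" idea, here is a
minimal sketch in plain Java. It does not use the Lucene API: the
interface SketchIndexWriter, both writer classes and StoreSwitchSketch
are hypothetical names made up for this example, and the real
IndexWriterInterface in the patch may look quite different.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical writer contract (an assumption, not the interface from the patch).
interface SketchIndexWriter {
    void addDocument(String text);
    int docCount();
}

// Stand-in for a file-centric, Directory-style store (assumption).
class DirectoryBackedWriter implements SketchIndexWriter {
    private final List<String> stored = new ArrayList<>();
    public void addDocument(String text) { stored.add(text); }
    public int docCount() { return stored.size(); }
}

// Stand-in for an object-graph store in the spirit of InstantiatedIndex (assumption).
class InstantiatedWriter implements SketchIndexWriter {
    private final List<String[]> tokenized = new ArrayList<>();
    public void addDocument(String text) { tokenized.add(text.split("\\s+")); }
    public int docCount() { return tokenized.size(); }
}

class StoreSwitchSketch {
    // The single place where the store is chosen.
    static SketchIndexWriter open(String store) {
        return "instantiated".equals(store)
                ? new InstantiatedWriter()
                : new DirectoryBackedWriter();
    }

    // Consumer code written once, against the interface only.
    static int indexAll(SketchIndexWriter writer, String... docs) {
        for (String doc : docs) {
            writer.addDocument(doc);
        }
        return writer.docCount();
    }
}
```

With this shape, StoreSwitchSketch.indexAll(StoreSwitchSketch.open("instantiated"), ...)
and the same call with another store name run identical consumer code.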

This post is about InstantiatedIndex and the things I built upon it.
As time passed I just gave up on keeping them up to date. It is in
use at this one place where it is just spinning on with no need to
update, stuck on Lucene 2.0 or so. We are now getting close to Lucene
3.0 and I would hate to see this code get lost in time.

It has so many neat features. Being really, really fast on small
corpora is just one.

In essence the design is similar to contrib/MemoryIndex, but it can
hold multiple documents.
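
To make "an in-memory index that holds multiple documents" concrete,
here is a very small sketch of an inverted index kept entirely as
Java objects. It is not the InstantiatedIndex code and makes no claim
about its internals; the class TinyInMemoryIndex and its methods are
hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy index: term -> sorted set of document ids, all on the heap.
class TinyInMemoryIndex {
    private final List<String> documents = new ArrayList<>();
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Adds one document and returns its id; any number of documents may be added.
    int add(String text) {
        int docId = documents.size();
        documents.add(text);
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, t -> new TreeSet<>()).add(docId);
            }
        }
        return docId;
    }

    // Returns the ids of all documents containing the term, in ascending order.
    List<Integer> search(String term) {
        Set<Integer> hits = postings.get(term.toLowerCase());
        if (hits == null) {
            return new ArrayList<>();
        }
        return new ArrayList<>(hits);
    }
}
```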

The definalization and interface also allow for index
insert/delete/optimization notifications.
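
The sketch below shows why a non-final writer makes such notifications
easy to add: a subclass overrides the mutating methods and informs
listeners. None of this is Lucene code; BaseWriter, NotifyingWriter,
IndexChangeListener and EventLog are hypothetical names, and the real
patch may wire the notifications differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback interface (assumption, not part of Lucene).
interface IndexChangeListener {
    void onInsert(String doc);
    void onDelete(String doc);
    void onOptimize();
}

// Stand-in for a writer class that is NOT final and can be extended.
class BaseWriter {
    protected final List<String> docs = new ArrayList<>();
    public void addDocument(String doc) { docs.add(doc); }
    public boolean deleteDocument(String doc) { return docs.remove(doc); }
    public void optimize() { /* nothing to do in this toy store */ }
}

// Subclass that reports every successful change to its listeners.
class NotifyingWriter extends BaseWriter {
    private final List<IndexChangeListener> listeners = new ArrayList<>();

    void addListener(IndexChangeListener listener) { listeners.add(listener); }

    @Override
    public void addDocument(String doc) {
        super.addDocument(doc);
        for (IndexChangeListener l : listeners) {
            l.onInsert(doc);
        }
    }

    @Override
    public boolean deleteDocument(String doc) {
        boolean removed = super.deleteDocument(doc);
        if (removed) {
            for (IndexChangeListener l : listeners) {
                l.onDelete(doc);
            }
        }
        return removed;
    }

    @Override
    public void optimize() {
        super.optimize();
        for (IndexChangeListener l : listeners) {
            l.onOptimize();
        }
    }
}

// Simple listener that records events, useful for trying the sketch out.
class EventLog implements IndexChangeListener {
    final List<String> events = new ArrayList<>();
    public void onInsert(String doc) { events.add("insert:" + doc); }
    public void onDelete(String doc) { events.add("delete:" + doc); }
    public void onOptimize() { events.add("optimize"); }
}
```

If BaseWriter were declared final, NotifyingWriter could not exist and
every caller would have to be wrapped by hand instead.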

These two features combined yielded an active cache (not really
used in any project, just a proof-of-concept I experimented with on a
site where a lot of users place the exact same query) that updates
cached results only when they are affected by new data. This could be
done with MemoryIndex too, but not as fast as with InstantiatedIndex,
which can handle batches of documents.
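
One way such an active cache could work is sketched below, assuming
single-term queries and a toy document store. It is not the
proof-of-concept described above: ActiveQueryCache and all of its
members are hypothetical, and the full scan in run() merely stands in
for executing a real query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical cache that refreshes a cached query only when new data touches it.
class ActiveQueryCache {
    private final List<String> docs = new ArrayList<>();
    private final Map<String, List<Integer>> cached = new HashMap<>();
    int recomputations = 0; // counts how often a query was actually executed

    private static Set<String> tokens(String text) {
        return new HashSet<>(Arrays.asList(text.toLowerCase().split("\\W+")));
    }

    // Stand-in for running the real query: scan every document.
    private List<Integer> run(String term) {
        recomputations++;
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (tokens(docs.get(i)).contains(term)) {
                hits.add(i);
            }
        }
        return hits;
    }

    // Serves from the cache when possible.
    List<Integer> search(String term) {
        String key = term.toLowerCase();
        List<Integer> hits = cached.get(key);
        if (hits == null) {
            hits = run(key);
            cached.put(key, hits);
        }
        return hits;
    }

    // Adds a batch, then refreshes only cached queries whose term occurs in it.
    void addBatch(List<String> batch) {
        Set<String> touched = new HashSet<>();
        for (String doc : batch) {
            docs.add(doc);
            touched.addAll(tokens(doc));
        }
        for (String key : new ArrayList<>(cached.keySet())) {
            if (touched.contains(key)) {
                cached.put(key, run(key));
            }
        }
    }
}
```

In this sketch a cached query for "pear" is left alone when a batch
containing only "apple pie" arrives, while a cached query for "apple"
is recomputed once for the whole batch.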

One can however do a lot of other things with it.

In LUCENE-626 I also use InstantiatedIndex, getting some 10-20 times  
faster response times from my contrib/spellcheck augmentation than  
when using a RAMDirectory.

There are more features and potentially cool things one might want to  
consider in the 550-patch/UML diagram.

Would the core changes that InstantiatedIndex requires ever be
committed? If so, I could sit down and bring these patches up to date.
Otherwise I'll just let them become some deprecated artifact I use
for a couple of things such as spellchecking, rather than a neat
augmentation of Lucene I could use for any future development.

