lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stephane James Vaucher <>
Subject Re: index update (was Re: Large InputStream.BUFFER_SIZE causes OutOfMemoryError.. FYI)
Date Tue, 13 Apr 2004 17:09:04 GMT
I'm actually pretty lazy about index updates, and haven't had the need for 
efficiency, since my requirement is that new documents should be 
available on a next working day basis.

I reindex everything from scatch every night (400,000 docs) and store it 
in an timestamped index. When the reindexing is done, I alert a controller 
of the new active index. I keep a few versions of the index in case of 
a failure somewhere and I can always send a message to the controller to 
use an old index.


On Tue, 13 Apr 2004, petite_abeille wrote:

> On Apr 13, 2004, at 02:45, Kevin A. Burton wrote:
> > He mentioned that I might be able to squeeze 5-10% out of index merges 
> > this way.
> Talking of which... what strategy(ies) do people use to minimize 
> downtime when updating an index?
> My current "strategy" is as follow:
> (1) use a temporary RAMDirectory for ongoing updates.
> (2) perform a "copy on write" when flushing the RAMDirectory into the 
> persistent index.
> The second step means that I create an offline copy of a live index 
> before invoking addIndexes() and then substitute the old index with the 
> new, updated, one. While this effectively increase the time it takes to 
> update an index, it nonetheless reduce the *perceived* downtime for it.
> Thoughts? Alternative strategies?
> TIA.
> Cheers,
> PA.
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message