lucene-dev mailing list archives

From Grant Ingersoll
Subject optimize() method call
Date Fri, 06 Apr 2007 22:53:13 GMT
I was looking at the javadocs for the optimize() call on IndexWriter,
which contain a great deal of detail about what happens but very
little guidance on when to call it.  I would like to add more on the
when.  I generally optimize after I finish indexing, which is pretty
straightforward to determine when one has a more or less static
collection.  What isn't so clear to me, because I haven't dealt with
it much, is when optimize should be called in environments that are
frequently updated.

Here's what I have for text so far:
    * <p>It is recommended that this method be called upon completion of indexing.  In
    * environments with frequent updates optimize is best FILL IN HERE
    * </p>

Essentially, I am wondering what the best practices are for calling
optimize, especially in a frequent-update environment.  My gut
feeling is that it should just be scheduled to run on a regular
basis, ideally when there is a lull.  The docs allude to the fact
that search performance will be better, but has anyone quantified
it?  The mergeFactor docs say that a smaller merge factor results in
faster searches on unoptimized indices.  I presume that means faster
relative to higher merge factors, but still not as fast as an
optimized index, correct?  If it hasn't been quantified, maybe I will
try to whip up a benchmark for it.
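
To make the "schedule it regularly, but only during a lull" idea
concrete, here is a minimal sketch in plain Java.  It is not Lucene
code: LullOptimizer, recordUpdate(), maybeOptimize() and the quiet
period are all hypothetical names made up for illustration, and the
Runnable is only a stand-in for whatever ends up calling optimize()
on the IndexWriter.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/**
 * Hypothetical helper (not part of Lucene): runs an "optimize" task only
 * when the index has changed and no update has arrived for a quiet period.
 */
final class LullOptimizer {
    private final Runnable optimizeTask; // assumption: wraps a call such as writer.optimize()
    private final long quietMillis;      // how long the index must be idle before optimizing
    private final LongSupplier clock;    // injectable time source, e.g. System::currentTimeMillis
    private long lastUpdateMillis;
    private boolean dirty;               // true if anything changed since the last optimize

    LullOptimizer(Runnable optimizeTask, long quietMillis, LongSupplier clock) {
        this.optimizeTask = optimizeTask;
        this.quietMillis = quietMillis;
        this.clock = clock;
    }

    /** Call after every add or delete so the helper knows the index is busy. */
    synchronized void recordUpdate() {
        lastUpdateMillis = clock.getAsLong();
        dirty = true;
    }

    /**
     * Runs the task if there are pending changes and the quiet period has
     * elapsed. Returns true if the task ran. The lock is held while the task
     * runs, so recordUpdate() callers wait until it finishes.
     */
    synchronized boolean maybeOptimize() {
        if (!dirty) {
            return false; // nothing changed, so there is nothing to merge
        }
        if (clock.getAsLong() - lastUpdateMillis < quietMillis) {
            return false; // updates are still arriving, wait for a lull
        }
        optimizeTask.run();
        dirty = false;
        return true;
    }

    /** Checks for a lull every periodMillis on a single background thread. */
    ScheduledExecutorService start(long periodMillis) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        ses.scheduleAtFixedRate(this::maybeOptimize, periodMillis, periodMillis,
                TimeUnit.MILLISECONDS);
        return ses;
    }
}
```

The indexing code would call recordUpdate() after each change and
start(...) once at startup.  Whether a lull-based trigger like this
beats a fixed nightly schedule is exactly the open question here.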

So, do people in these types of environments typically schedule
optimize to run at night, every few hours, or what?  I know, "It
depends...", but I am just wondering if there is a general consensus
that would be useful to pass along to readers.

Grant Ingersoll
Center for Natural Language Processing
