lucene-dev mailing list archives

From Robert Engels <>
Subject Re: optimize() method call
Date Sat, 07 Apr 2007 07:00:18 GMT
I think this is great, and it gave me an idea. What if another thread could call a "stop optimize"
method that halts the optimize once it reaches a consistent state (that is, not in the middle of a
segment merge)?

We schedule our optimizes for the "lull" period, but with 24/7 operation such a period can be
hard to find.

Being able to stop and then resume the optimize seems like a great idea.
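
A minimal sketch of the stop/resume idea, assuming nothing about Lucene's internals: the class below is hypothetical, is not part of the IndexWriter API, and models each segment as a plain document count. The only point it illustrates is that the stop flag is checked between merges, never inside one, so a stop always leaves the segment list in a consistent state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical illustration only; none of these names exist in Lucene. */
class StoppableOptimize {
    // Set by another thread to ask the optimize to stop at the next safe point.
    private final AtomicBoolean stopRequested = new AtomicBoolean(false);
    // Each entry stands in for one segment, represented by its document count.
    private final List<Integer> segments;

    StoppableOptimize(List<Integer> segmentDocCounts) {
        this.segments = new ArrayList<>(segmentDocCounts);
    }

    /** Safe to call from any thread; takes effect before the next merge starts. */
    void requestStop() {
        stopRequested.set(true);
    }

    /** Merges until one segment remains or a stop is pending. Returns true if fully optimized. */
    synchronized boolean optimize() {
        while (segments.size() > 1) {
            if (stopRequested.get()) {
                return false; // stopped between merges, so the state is consistent
            }
            // One whole merge; it is never interrupted part-way through.
            int merged = segments.remove(0) + segments.remove(0);
            segments.add(merged);
        }
        return true;
    }

    /** Clears a pending stop request and continues from where the last run left off. */
    synchronized boolean resume() {
        stopRequested.set(false);
        return optimize();
    }

    synchronized int segmentCount() {
        return segments.size();
    }

    synchronized int totalDocs() {
        int sum = 0;
        for (int count : segments) {
            sum += count;
        }
        return sum;
    }
}
```

In this sketch a stop request stays pending until resume() is called, so a request that arrives just before optimize() starts is not lost.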

-----Original Message-----
>From: Grant Ingersoll <>
>Sent: Apr 6, 2007 3:53 PM
>Subject: optimize() method call
>I was looking at the javadocs for the optimize() call on IndexWriter  
>which contain a great amount of detail about what happens, but very  
>little guidance on when.  I would like to add more on when.  I  
>generally do optimize after I finish my indexing, which is pretty  
>straightforward to determine when one has a more or less static  
>collection.  What isn't so clear to me, b/c I haven't dealt w/ it too  
>much is when optimize should be called in environments that are  
>frequently updated.
>Here's what I have for text so far:
>    * <p>It is recommended that this method be called upon completion  
>of indexing.  In
>    * environments with frequent updates optimize is best FILL IN HERE
>    * </p>
>Essentially, I am wondering what are the best practices for calling  
>optimize, especially in a frequent update environment.  My gut  
>feeling is that it should just be scheduled to be done on a regular  
>basis, ideally when there is a lull.  The docs allude to the fact  
>that search performance will be better, but has anyone quantified  
>it?  The mergeFactor docs say that a smaller merge factor results in
>faster searches on unoptimized indices (I presume that means faster
>relative to higher merge factors, but still not as fast as
>optimized, correct?).  If it hasn't been quantified, maybe I will try
>to whip up a benchmark for it.
>So, do people in these types of environments typically schedule
>optimize to occur at night or every few hours, or what?  I know, "It
>depends...", I am just wondering if there is a general consensus that
>would be useful to pass along to readers.
>Grant Ingersoll
>Center for Natural Language Processing
