lucene-java-user mailing list archives

Subject Re: recurrent IO/CPU peaks
Date Tue, 01 Mar 2011 16:40:50 GMT

We developed a real-time logging system. We index 4.5 million
events/day, spread over multiple servers, each with its own index. Every
night we delete events from the index based on a retention policy, then
we optimize. Each server takes between 1 and 2 hours to optimize. Ideally,
we would like to optimize more quickly without compromising search
performance. The Lucene in Action book says "use optimize
sparingly; use the optimize(maxNumSegments) method instead". What is a
reasonable maxNumSegments in my situation?
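
To make the trade-off concrete, here is a rough back-of-the-envelope sketch in plain Java (no Lucene dependency). It is a toy model, not Lucene's actual merge algorithm: it assumes that reaching maxNumSegments is done by merging only the smallest segments into one, in a single pass, and it ignores the space reclaimed from deleted documents. The class name, method name, and segment sizes are all made up for illustration; only the 70 GB total comes from this thread.

```java
import java.util.Arrays;

// Toy model (NOT Lucene's merge policy): estimate how much data a partial
// optimize would rewrite if it merged only the smallest segments into one.
class PartialOptimizeEstimate {

    /**
     * Returns the combined size of the segments that would have to be merged
     * so that at most maxNumSegments remain. Units are whatever the caller
     * uses for segmentSizes (GB in the example below).
     */
    static long sizeRewritten(long[] segmentSizes, int maxNumSegments) {
        if (maxNumSegments < 1) {
            throw new IllegalArgumentException("maxNumSegments must be >= 1");
        }
        // Merging k segments into one reduces the segment count by k - 1.
        int toMerge = segmentSizes.length - maxNumSegments + 1;
        if (toMerge < 2) {
            return 0; // already at or below the target, nothing to merge
        }
        long[] sorted = segmentSizes.clone();
        Arrays.sort(sorted); // ascending, so the smallest segments come first
        long total = 0;
        for (int i = 0; i < toMerge; i++) {
            total += sorted[i];
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical segment sizes in GB, summing to 70 GB.
        long[] sizes = {30, 20, 10, 5, 2, 1, 1, 1};
        for (int n : new int[] {1, 5, 10}) {
            System.out.println("maxNumSegments=" + n + " -> about "
                    + sizeRewritten(sizes, n) + " GB rewritten");
        }
    }
}
```

With these invented sizes the model rewrites all 70 GB for maxNumSegments=1, only 5 GB for maxNumSegments=5, and nothing for maxNumSegments=10. Real numbers will depend on the actual segment layout and merge policy, so this only shows why a larger maxNumSegments can be much cheaper; it is no substitute for measuring on a real index.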

Michael McCandless wrote on 01.03.2011 17:09:

On Tue, Mar 1, 2011 at 3:17 AM,  <> wrote:
> Hi, OK so I will not bother using TieredMergePolicy for now. I will do
> some more tests with the contrib balanced merge policy, playing with the
> optimize(maxNumSegments) to try decreasing the optimize time (which is an
> issue for us today). My index contains 35 million documents. The size on
> disk is approx. 70 GB. Are there any guidelines as to how to set
> maxNumSegments?

I don't think we have any guidelines yet... but if you get some
numbers then please post back :)  That's how guidelines develop!

But: why are you optimizing so often?

