lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From v.se...@lombardodier.com
Subject Re: recurrent IO/CPU peaks
Date Tue, 01 Mar 2011 16:40:50 GMT
Hi,

we developped a real time logging system. we index 4.5 millions 
events/day, spread over multiple servers, each with its own index. every 
night with delete events from the index based on a retention policy then 
we optimize. each server takes between 1 and 2 hours to optimize. ideally, 
we would like to optimize more quickly, without compromising the search 
performances. in the lucene in action book, it says "use optimize 
sparingly; use the optimize(maxNumSegments) method instead". what is a 
reasonnable maxNumSegments in my situation?
Thanks,
Vincent







Michael McCandless <lucene@mikemccandless.com> 
 
 
01.03.2011 17:09
Please respond to
java-user@lucene.apache.org



To
java-user@lucene.apache.org
cc
v.sevel@lombardodier.com
Subject
Re: recurrent IO/CPU peaks






On Tue, Mar 1, 2011 at 3:17 AM,  <v.sevel@lombardodier.com> wrote:
> Hi, OK so I will not bother using TieredMergePolicy for now. I will do
> some more tests with the contrib balanced merge policy, playing with the
> optimize(maxNumSegments) to try decreasing the optimize time (which is 
an
> issue for us today). My index contains 35 millions documents. The size 
on
> disk is approx. 70 Gb. Are there any guidelines as to how to set
> maxNumSegments?

I don't think we have any guidelines yet... but if you get some
numbers then please post back :)  That's how guidelines develop!

But: why are you optimizing so often?

-- 
Mike

http://blog.mikemccandless.com

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org




************************ DISCLAIMER ************************
This message is intended only for use by the person to
whom it is addressed. It may contain information that is
privileged and confidential. Its content does not
constitute a formal commitment by Lombard Odier
Darier Hentsch & Cie or any of its branches or affiliates.
If you are not the intended recipient of this message,
kindly notify the sender immediately and destroy this
message. Thank You.
*****************************************************************

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message