lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From v.se...@lombardodier.com
Subject Re: recurrent IO/CPU peaks
Date Tue, 01 Mar 2011 08:17:47 GMT
Hi, OK so I will not bother using TieredMergePolicy for now. I will do 
some more tests with the contrib balanced merge policy, playing with the 
optimize(maxNumSegments) to try decreasing the optimize time (which is an 
issue for us today). My index contains 35 millions documents. The size on 
disk is approx. 70 Gb. Are there any guidelines as to how to set 
maxNumSegments? 

Thanks,


Vincent 







Michael McCandless <lucene@mikemccandless.com> 
 
 
28.02.2011 23:47



To
v.sevel@lombardodier.com
cc

Subject
Re: recurrent IO/CPU peaks






Well... TieredMergePolicy really won't make much difference in your 
optimize time... only (hopefully) in the "normal merge" cost.

But it's still a work in progress and I haven't yet backported to 3.x.

Are you using trunk or 3.x?

Mike

On Tue, Feb 22, 2011 at 8:01 AM, <v.sevel@lombardodier.com> wrote:

Hi, I downloaded from the trunk and tried to apply the patch, but sources 
did not match. 
Would you be able to provide me with a jar that contained just the 
TieredMergePolicy extra policy, or give me a lucene-core that has it? 
I would be very interested in testing this other policy. 
Thanks, 


Vincent 






Michael McCandless <lucene@mikemccandless.com> 
 

22.02.2011 12:49 


To
java-user@lucene.apache.org 
cc
v.sevel@lombardodier.com 
Subject
Re: recurrent IO/CPU peaks








On Tue, Feb 22, 2011 at 3:15 AM,  <> wrote:

> Here is how long it took for each run :
>  - default : run 1 = 55 minutes, run 2 = 59 minutes
>  - balanced : run 1 = 145 minutes, run 2 = 121 minutes
>
> Is that an expected behavior?

Hmm BalancedSegmentMergePolicy was over 2X slower to optimize...?
That's curious.

Though, BSMP is supposed to be better only for the non-optimize case
(ie just adding/deleting docs and never calling optimize).  It tries
to defer picking very large segment merges.

You could also try the TieredMergePolicy (currently still a patch on
http://issues.apache.org/jira/browse/LUCENE-854); it also tries to not
over-merge like LogByteSizeMergePolicy does (sometimes).  I wrote
about this in 
http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html


-- 
Mike

http://blog.mikemccandless.com



************************ DISCLAIMER ************************
This message is intended only for use by the person to
whom it is addressed. It may contain information that is
privileged and confidential. Its content does not
constitute a formal commitment by Lombard Odier
Darier Hentsch & Cie or any of its branches or affiliates.
If you are not the intended recipient of this message,
kindly notify the sender immediately and destroy this
message. Thank You.
*****************************************************************



-- 
Mike

http://blog.mikemccandless.com


************************ DISCLAIMER ************************
This message is intended only for use by the person to
whom it is addressed. It may contain information that is
privileged and confidential. Its content does not
constitute a formal commitment by Lombard Odier
Darier Hentsch & Cie or any of its branches or affiliates.
If you are not the intended recipient of this message,
kindly notify the sender immediately and destroy this
message. Thank You.
*****************************************************************

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message