lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tate Avery" <tate.av...@nstein.com>
Subject RE: Lucene on Windows
Date Tue, 21 Oct 2003 17:49:02 GMT
Doug,

Re: high merge factor.  I was building test indexes and writing out 300 segments of 300 docs
and merging them every 90,000 kept the 'merging' time down to a minimum (for my slowish HD).

I was assuming that 11 of these large merges during the indexing of 1,000,000 docs (plus a
final optimize) would be faster than 10,000 little merges if the mergeFactor was set to 10
(for the same corpus).

Maybe this is not the case.




Tate


-----Original Message-----
From: Doug Cutting [mailto:cutting@lucene.com]
Sent: October 21, 2003 12:37 PM
To: Lucene Users List
Subject: Re: Lucene on Windows


Tate Avery wrote:
> You might have trouble with "too many open files" if you set your mergeFactor too high.
 For example, on my Win2k, I can go up to mergeFactor=300 (or so).  At 400 I get a too many
open files error.  Note: the default mergeFactor of 10 should give no trouble.

Please note that it is never recommended that you set mergeFactor 
anywhere near this high.  I don't know why folks do this.  It really 
doesn't make indexing much faster, and it makes searching slower if you 
don't optimize.  It's a bad idea.  The default setting of 10 works 
pretty well.  I've also had good experience setting it as high as 50 on 
big batch indexing runs, but do not recommend setting it much higher 
than that.  Even then, this can cause problems if you need to use 
several indexes at once, or you have lots of fields.

Doug


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message