lucene-java-user mailing list archives

From Paul Elschot <paul.elsc...@xs4all.nl>
Subject Re: IndexWriter.optimize and memory usage
Date Fri, 03 Dec 2004 07:43:07 GMT
On Friday 03 December 2004 07:50, Chris Hostetter wrote:
> 
> : See IndexWriter javadoc and in particular mergeFactor, minMergeDocs,
> : and maxMergeDocs.  This will let you control the size of your segments,
> : the frequency of segment merges, the amount of buffered Documents in
> : RAM between segment merges and such.  Also, you ask about calling
> 
> Yeah, I'm familiar with those options.  I initially tried just using the
> defaults, and then I tried using a high mergeFactor (100 if I remember
> right) -- both of which had the same result: index built fine, but the
> optimize call at the end ran out of memory.
> 
> : optimize periodically - no need, Lucene should already merge segments
> : once in a while for you.  Optimize at the end.  You can also experiment
> 
> So, if I'm understanding you (and the javadocs) correctly, the real key
> here is maxMergeDocs.  It seems like addDocument will never merge a
> segment until maxMergeDocs have been added? ... meaning that I need a
> value less than the default (Integer.MAX_VALUE) if I want IndexWriter to
> do incremental merges as I go ...
> 
> 	...except...
> 
> ...if that were the case, then what exactly is the meaning of mergeFactor?

maxMergeDocs limits the size of the intermediate segments that merges
create while documents are being added.
With maxMergeDocs at its default, adding a document can take as much time as
(and have the same effect as) optimize.  E.g. with mergeFactor at 10, the
1000th added document will create a segment of size 1000.
With maxMergeDocs at a value lower than 1000, that last merge (of the 10
segments with 100 docs each) will not be done.
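
To make the arithmetic above concrete, here is a minimal sketch of the merge cascade as a toy simulation. It is not Lucene's code: the class name MergeSim and the method simulate are hypothetical, segments are modelled as plain document counts, and the sketch assumes that every added document starts out as its own one-document segment and that minMergeDocs is 10.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only (hypothetical names, not part of Lucene).
// Each list entry is the document count of one segment, oldest first.
final class MergeSim {

    static List<Integer> simulate(int docs, int mergeFactor,
                                  int minMergeDocs, int maxMergeDocs) {
        List<Integer> segments = new ArrayList<>();
        for (int d = 0; d < docs; d++) {
            segments.add(1); // assumption: each new document is a 1-doc segment

            long target = minMergeDocs; // long, so target *= mergeFactor cannot overflow
            while (target <= maxMergeDocs) {
                // Walk back from the newest segment, summing the segments
                // that are still smaller than the current target size.
                int i = segments.size();
                long sum = 0;
                while (--i >= 0) {
                    if (segments.get(i) >= target) {
                        break;
                    }
                    sum += segments.get(i);
                }
                if (sum < target) {
                    break; // not enough small segments yet, nothing to merge
                }
                // Merge segments i+1 .. end into a single segment.
                segments.subList(i + 1, segments.size()).clear();
                segments.add((int) sum);
                target *= mergeFactor; // next level of the cascade
            }
        }
        return segments;
    }

    public static void main(String[] args) {
        // maxMergeDocs at Integer.MAX_VALUE: the 1000th document cascades
        // all the way up to one segment.
        System.out.println(simulate(1000, 10, 10, Integer.MAX_VALUE));
        // prints [1000]

        // maxMergeDocs below 1000: the final merge is skipped.
        System.out.println(simulate(1000, 10, 10, 999));
        // prints [100, 100, 100, 100, 100, 100, 100, 100, 100, 100]
    }
}
```

In this model the merge target grows by a factor of mergeFactor at each level, and maxMergeDocs only cuts the cascade off at the top.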

optimize() uses mergeFactor for its final merges, but it ignores
maxMergeDocs.

Regards,
Paul Elschot

