lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: require new comment for IndexWriter.mergeFactor
Date Sun, 06 Nov 2005 08:32:40 GMT
I think the best description of various IndexWriter parameters is in
Lucene in Action (sorry if this sounds like a plug, but it's not the
intention) in section 2.7.1., page 42 (
http://www.lucenebook.com/search?query=mergeFactor )

mergeFactor controls how often Lucene merges segments while indexing.
minMergeDocs (I think it's renamed now) controls how many documents to
buffer in RAM before writing them to disk.

Only the later affects RAM usage.  Both affect speed (check out free
code from Lucene in Action at http://lucenebook.com/ to see how
changing various parameters changes indexing performance).

Otis


--- Kerang Lv <lvkrnewer@yahoo.com> wrote:

> Does the IndexWriter.mergeFactor remain the same
> effect on the RAM use after the introduce of
> IndexWriter.minMergeDocs?
> 
> The minMergeDocs was introduced into
> IndexWriter(Revision 1.21 in cvs) in order to control
> the number of
> Documents merged in RAMDirectory independently of the
> mergeFactor (see
> http://issues.apache.org/bugzilla/show_bug.cgi?id=23754).
> And the IndexWriter.maybeMergeSegments changed from
> then on:
> 
> @@ -375,7 +385,7 @@
>  
>    /** Incremental segment merger.  */
>    private final void maybeMergeSegments() throws
> IOException {
> -    long targetMergeDocs = mergeFactor;
> +    long targetMergeDocs = minMergeDocs;
> 
> But the comment of mergeFactor remains:
> 
> The following is the comment of
> IndexWriter.mergeFactor in the 1.2 RC6:
>   /** Determines how often segment indexes are merged
> by addDocument().  With
>    * smaller values, less RAM is used while indexing,
> and searches on
>    * unoptimized indexes are faster, but indexing
> speed is slower.  With larger
>    * values more RAM is used while indexing and
> searches on unoptimized indexes
>    * are slower, but indexing is faster.  Thus larger
> values (> 10) are best
>    * for batched index creation, and smaller values (<
> 10) for indexes that are
>    * interactively maintained.
>    *
>    * <p>This must never be less than 2.  The default
> value is 10.*/
>   public int mergeFactor = 10;
> 
> 
> and now, it's in 1.4.3:
>   /** Determines how often segment indices are merged
> by addDocument().  With
>    * smaller values, less RAM is used while indexing,
> and searches on
>    * unoptimized indices are faster, but indexing
> speed is slower.  With larger
>    * values, more RAM is used during indexing, and
> while searches on unoptimized
>    * indices are slower, indexing is faster.  Thus
> larger values (> 10) are best
>    * for batch index creation, and smaller values (<
> 10) for indices that are
>    * interactively maintained.
>    *
>    * <p>This must never be less than 2.  The default
> value is 10.*/
>   public int mergeFactor = DEFAULT_MERGE_FACTOR;
> 
> Does the IndexWriter.mergeFactor remain the same
> effect on the RAM use?
> 
> 
> 		
> __________________________________ 
> Start your day with Yahoo! - Make it your home page! 
> http://www.yahoo.com/r/hs
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message