lucene-java-user mailing list archives

From "Dalton, Jeffery"
Subject Lucene Merge Algorithm, max number of segments
Date Mon, 06 Mar 2006 21:57:59 GMT
I am just going to wax philosophical for a minute.  I am trying to
understand lucene's merging algorithm in depth.  

Let's say I create an index of 25M web pages on a single machine.  While
creating this index I am doing both search and indexing / re-indexing at
the same time, a bit like Technorati, and so I assume a merge factor of
2, as Doug references in the Technorati post.  According to the
algorithm described in Doug's Pisa talk, the average number of indices
when I get near 25M docs is:
2 * log2(25,000,000) / 2 = 24.6 indices.
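
The arithmetic above looks like an instance of the general form b * log_b(N) / 2, with b the merge factor and N the number of documents. Here is a minimal Java sketch of that estimate, assuming that reading of the formula is right. The class name SegmentEstimate and the method averageSegments are made up for illustration and are not part of Lucene.

```java
// Hypothetical helper, not a Lucene API: evaluates b * log_b(N) / 2,
// the average-segment-count estimate used in the calculation above.
final class SegmentEstimate {

    // mergeFactor is b, numDocs is N.
    static double averageSegments(int mergeFactor, long numDocs) {
        if (mergeFactor < 2 || numDocs < 1) {
            throw new IllegalArgumentException("need mergeFactor >= 2 and numDocs >= 1");
        }
        // log_b(N) via change of base, since Java only offers natural log and log10.
        double logBaseB = Math.log(numDocs) / Math.log(mergeFactor);
        return mergeFactor * logBaseB / 2.0;
    }

    public static void main(String[] args) {
        long docs = 25_000_000L;
        // Compare a low merge factor with a higher one for the same corpus size.
        for (int b : new int[] {2, 10}) {
            System.out.printf("mergeFactor=%d -> %.1f segments on average%n",
                    b, averageSegments(b, docs));
        }
    }
}
```

With a merge factor of 2 and 25M docs this gives the same figure of about 24.6. It is only an average under the assumed formula, not a bound on the segment count at any given moment.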

According to the FAQ,
there is no way to set a hard limit on the number of indices.  It seems
to me that it might be nice to be able to impose a hard limit on the
number of segments, in this case, perhaps 5 or 10 indices.  

Does anyone have any experience with this?  Has anyone tried imposing a
hard limit?  


- Jeff
