lucene-java-user mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: Way to repair an index broking during 1/2 optimize?
Date Thu, 08 Jul 2004 23:03:05 GMT
Kevin A. Burton wrote:
> During an optimize I assume Lucene starts writing to a new segment and 
> leaves all others in place until everything is done and THEN deletes them?

That's correct.

> The only settings I uses are:
> 
> targetIndex.mergeFactor=10;
> targetIndex.minMergeDocs=1000;
> 
> the resulting index has 230k files in it :-/

Something sounds very wrong for there to be that many files.

The maximum number of files should be around:

   (7 + numIndexedFields) * (mergeFactor-1) * log_base_mergeFactor(numDocs/minMergeDocs)

With 14M documents, log_10(14M/1000) is about 4, and mergeFactor-1 is 9, so the last two factors multiply to 36, which gives, for you:

   (7 + numIndexedFields) * 36 = 230k
    7*36 + numIndexedFields*36 = 230k
    numIndexedFields = (230k - 7*36) / 36 =~ 6k

So you'd have to have around 6k unique field names to get 230k files. 
Or something else must be wrong.  Are you running on win32, where files 
that are still open cannot be deleted?

With the typical handful of fields, one should never see more than 
hundreds of files.
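
For anyone who wants to plug in their own numbers, the estimate above can be sketched in a few lines of Java. This is only an illustration of the arithmetic, not anything in Lucene: the class and method names (FileCountEstimate, mergeLevels, maxFiles, impliedIndexedFields) are made up for this example, and the 5-field case is an assumed stand-in for "a typical handful of fields".

```java
public class FileCountEstimate {

    // Number of merge levels: log base mergeFactor of (numDocs / minMergeDocs).
    static double mergeLevels(int mergeFactor, long numDocs, int minMergeDocs) {
        return Math.log((double) numDocs / minMergeDocs) / Math.log(mergeFactor);
    }

    // Rough upper bound on the number of index files, per the formula above:
    // (7 + numIndexedFields) * (mergeFactor - 1) * mergeLevels
    static double maxFiles(int numIndexedFields, int mergeFactor,
                           long numDocs, int minMergeDocs) {
        return (7 + numIndexedFields) * (mergeFactor - 1)
                * mergeLevels(mergeFactor, numDocs, minMergeDocs);
    }

    // The same formula solved for numIndexedFields, given an observed file count.
    static double impliedIndexedFields(long observedFiles, int mergeFactor,
                                       long numDocs, int minMergeDocs) {
        double multiplier = (mergeFactor - 1)
                * mergeLevels(mergeFactor, numDocs, minMergeDocs);
        return observedFiles / multiplier - 7;
    }

    public static void main(String[] args) {
        // Settings from this thread: mergeFactor=10, minMergeDocs=1000, 14M docs.
        // The 5 indexed fields are an assumption made for illustration.
        System.out.printf("5 fields -> about %.0f files%n",
                maxFiles(5, 10, 14_000_000L, 1000));
        System.out.printf("230k files -> about %.0f indexed fields%n",
                impliedIndexedFields(230_000L, 10, 14_000_000L, 1000));
    }
}
```

The sketch does not round the logarithm to 4, so its results differ a little from the hand arithmetic, but they land in the same range: a few hundred files for a handful of fields, and roughly 6k fields to account for 230k files.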

Doug

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org

