lucene-java-user mailing list archives

From "Phil Herold" <pher...@d-wise.com>
Subject index size doubling / optimization (Lucene 3.0.3)
Date Wed, 09 Feb 2011 18:58:00 GMT
I know that the size of a Lucene index can double while optimization is
underway, but it's supposed to eventually settle back down to the original
size, correct? We have a Lucene index of 100K documents that is normally
about 12GB in size. It is split across 10 sub-indexes, which we search using
MultiSearcher. It takes our system about 7 hours to traverse the file system
and update the index; a run typically adds, updates, or deletes anywhere
from a dozen to a few hundred documents. We optimize each sub-index at the
end of the run (although this is configurable). The system runs fine for
several days, with the total size of the index staying fairly consistent,
then all of a sudden the index size doubles to about 25GB and stays there.
I'm assuming this happens after an optimization; there is certainly not a
doubling of the data being added.
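
To pin down which sub-index grows and after which run, the on-disk size of
each sub-index could be logged at the end of every update. Below is a minimal
sketch of that, not part of our actual system: it assumes each sub-index lives
in its own directory directly under a common root, it needs Java 8 or later,
and the class and method names (IndexSizeReport, directorySize,
sizesBySubIndex) are made up for illustration. It uses only the JDK, no Lucene
calls.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

public class IndexSizeReport {

    /** Sums the sizes, in bytes, of all regular files below dir. */
    static long directorySize(Path dir) throws IOException {
        try (Stream<Path> files = Files.walk(dir)) {
            return files.filter(Files::isRegularFile)
                        .mapToLong(p -> {
                            try {
                                return Files.size(p);
                            } catch (IOException e) {
                                throw new UncheckedIOException(e);
                            }
                        })
                        .sum();
        }
    }

    /**
     * Maps each immediate subdirectory of root to its size in bytes.
     * Assumption: one subdirectory per sub-index.
     */
    static Map<String, Long> sizesBySubIndex(Path root) throws IOException {
        Map<String, Long> sizes = new TreeMap<>();
        try (Stream<Path> children = Files.list(root)) {
            for (Path child : (Iterable<Path>) children::iterator) {
                if (Files.isDirectory(child)) {
                    sizes.put(child.getFileName().toString(), directorySize(child));
                }
            }
        }
        return sizes;
    }

    public static void main(String[] args) throws IOException {
        // Root directory holding the sub-indexes; defaults to the current directory.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        long total = 0;
        for (Map.Entry<String, Long> e : sizesBySubIndex(root).entrySet()) {
            System.out.printf("%-20s %,15d bytes%n", e.getKey(), e.getValue());
            total += e.getValue();
        }
        System.out.printf("%-20s %,15d bytes%n", "TOTAL", total);
    }
}
```

Running this before and after each optimize, and comparing the output across
days, should show whether the jump to 25GB comes from one sub-index or from
all of them.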

 

Is this expected or known behavior, or a bug of some kind?

 

I've read various postings on the 'net regarding optimization, and when to
do it, if at all, and I'm certainly open to other strategies. Search time is
critical for our users. 

 

FWIW, we have the following tunable parameters configured for our index:

mergeFactor: 5
maxMergeDocs: 1000
maxBufferedDocs: 200
RAMBufferSizeMB: 16

 

Any advice or help is appreciated. 

