lucene-dev mailing list archives

From Bernhard Messer
Subject possible SegmentMerger optimization
Date Sat, 07 Aug 2004 12:12:53 GMT
Hi developers,

There may be a small but effective way to optimize the SegmentMerger 
class when the compound file option is enabled, which has been the 
default since Lucene 1.4.

The current implementation creates and writes the compound index file 
every time the merge() method is called. Because I/O operations are 
expensive and time-consuming, it would be better to write the compound 
index file only when optimizing the index. The change itself would be 
small: add a boolean parameter, SegmentMerger.merge(boolean finalize). 
The compound file would be created only if finalize == true and the 
compound option is enabled. To complete the implementation, the same 
parameter could be added to mergeSegments(int minSegment, boolean 
finalize) within IndexWriter. When mergeSegments is called from 
flushRamSegments() or maybeMergeSegments(), finalize is set to false. 
Only when it is called from optimize() is finalize set to true and the 
compound file written.
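
To make the proposed call flow concrete, here is a minimal sketch. It 
does not use the real Lucene classes: SketchSegmentMerger and 
SketchIndexWriter are hypothetical stand-ins, the counter replaces the 
actual compound file write, and the real merge logic is omitted.

```java
// Hypothetical stand-ins; these are NOT the real Lucene classes.
class SketchSegmentMerger {
    private final boolean useCompoundFile;
    private int compoundWrites = 0;

    SketchSegmentMerger(boolean useCompoundFile) {
        this.useCompoundFile = useCompoundFile;
    }

    // Proposed signature from the message: merge(boolean finalize).
    void merge(boolean finalize) {
        // ... the per-segment files would be merged here (omitted) ...
        if (useCompoundFile && finalize) {
            compoundWrites++; // stands in for writing the compound file
        }
    }

    int compoundWrites() {
        return compoundWrites;
    }
}

class SketchIndexWriter {
    private final SketchSegmentMerger merger;

    SketchIndexWriter(boolean useCompoundFile) {
        this.merger = new SketchSegmentMerger(useCompoundFile);
    }

    // Intermediate merges pass finalize == false.
    void flushRamSegments()   { mergeSegments(0, false); }
    void maybeMergeSegments() { mergeSegments(0, false); }

    // Only optimize() passes finalize == true.
    void optimize()           { mergeSegments(0, true); }

    private void mergeSegments(int minSegment, boolean finalize) {
        // minSegment is unused in this sketch.
        merger.merge(finalize);
    }

    int compoundWrites() {
        return merger.compoundWrites();
    }
}
```

With the compound option enabled, calling flushRamSegments() and 
maybeMergeSegments() leaves the counter at 0, and the first optimize() 
call raises it to 1.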

The downside is that we would have to explain to developers that, if 
they do not optimize the index before closing it, the compound file 
option has no effect. The other issue is that we might run into the 
problem of too many open files, which was sometimes reported before the 
compound option was introduced.

These downsides could be addressed by making the optimization 
optionally available through IndexWriter. Developers using Lucene could 
then decide for themselves whether they want to use the "single 
compound write" option.
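
As a rough illustration of such an opt-in switch, the sketch below 
models the decision only. The class, the setSingleCompoundWrite setter 
and the counter are assumptions made for this example, not existing 
Lucene API.

```java
// Hypothetical model of the opt-in switch; names are illustrative only.
class OptionalCompoundWriteSketch {
    // Compound files are the default since Lucene 1.4 (per the message).
    private boolean useCompoundFile = true;
    // Proposed option; off by default so current behavior is unchanged.
    private boolean singleCompoundWrite = false;
    private int compoundWrites = 0;

    void setUseCompoundFile(boolean enabled) {
        useCompoundFile = enabled;
    }

    void setSingleCompoundWrite(boolean enabled) {
        singleCompoundWrite = enabled;
    }

    // finalize is true only for the merge triggered by optimize().
    void mergeSegments(boolean finalize) {
        // Option off: write the compound file on every merge (as today).
        // Option on: write it only for the final, optimizing merge.
        boolean writeCompound =
                useCompoundFile && (finalize || !singleCompoundWrite);
        if (writeCompound) {
            compoundWrites++; // stands in for writing the compound file
        }
    }

    int compoundWrites() {
        return compoundWrites;
    }
}
```

Leaving the switch off by default would keep existing applications 
unaffected, including their open-file behavior, unless they 
explicitly enable it.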

If there is interest and you would like to see the patch, leave me a 
note and I'll create it.

Best regards
