Dmitry,

yep, you're right. Switching the compound file option on and off would do the trick and simulate the behavior I described. I ran some tests on that and found that it works perfectly. I think we can leave everything as it is, but maybe we should document it somewhere. Is there something like a "tips and tricks" section on the Lucene website?

Bernhard

Dmitry Serebrennikov wrote:

> Bernhard Messer wrote:
>
>> hi developers,
>>
>> maybe there is a small but effective way to optimize the
>> SegmentMerger class when the compound file option is enabled, which
>> is the default since Lucene 1.4.
>>
>> The current implementation creates and writes the compound index file
>> every time the merge() method is called. Because IO operations are
>> expensive and time consuming, it would be cool to write the compound
>> index file only when optimizing the index. The change itself wouldn't
>> be a big deal: add a boolean parameter, giving
>> SegmentMerger.merge(boolean finalize). Only if finalize==true and the
>> compound option is enabled will the compound file be created. To
>> complete the implementation, the same parameter could be added to
>> mergeSegments(int minSegment, boolean finalize) within IndexWriter.
>> When mergeSegments is called from flushRamSegments() or
>> maybeMergeSegments(), finalize is set to false. Only when it is called
>> from optimize() will finalize be set to true and the compound file be
>> written.
>>
>> The downside will be explaining to developers that if they do not
>> optimize the index before closing it, the compound file option has no
>> effect. The other thing is that we might run into the problem of too
>> many open files, which was sometimes reported before the compound
>> option was introduced.
>
> Yea, that was kind of the point of having the compound files - to
> avoid too many file handles, especially during indexing. I hear you on
> inefficient use of disk IO, though.
>
>> The downside could be solved by making the optimization optionally
>> available through IndexWriter. Developers using Lucene could then
>> decide for themselves whether they want to use the "single compound
>> write" option or not.
>
> One could do that today. Just setUseCompoundFiles(false) during
> indexing and call setUseCompoundFiles(true) before the final optimize.
> Would that do the trick?
>
> Dmitry.
>
>> If this is wanted and you would like to see the patch, leave me a
>> note and I'll create it.
>>
>> best regards
>> Bernhard
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
>> For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
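
For anyone who wants to try the toggle described in this thread, here is a minimal sketch of the call order: compound files off while adding documents, back on just before the final optimize. The Writer interface and RecordingWriter class are hypothetical stand-ins that only record calls; they are not Lucene's IndexWriter. The method names follow the wording used in the thread, so the exact setter spelling should be checked against the IndexWriter Javadoc for your Lucene version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CompoundToggleSketch {

    /** Hypothetical stand-in for the few writer operations named in the thread. */
    interface Writer {
        void setUseCompoundFiles(boolean value);
        void addDocument(String doc);
        void optimize();
        void close();
    }

    /** Fake writer that records each call together with the current compound setting. */
    static class RecordingWriter implements Writer {
        final List<String> log = new ArrayList<String>();
        private boolean compound = true; // assumed default, as stated in the thread for Lucene 1.4

        public void setUseCompoundFiles(boolean value) { compound = value; }
        public void addDocument(String doc) { log.add("add(" + doc + ") compound=" + compound); }
        public void optimize() { log.add("optimize compound=" + compound); }
        public void close() { log.add("close"); }
    }

    /** Index with compound files off, then switch them on for the final optimize. */
    static void indexAll(Writer writer, List<String> docs) {
        writer.setUseCompoundFiles(false);   // intermediate merges skip the compound file
        for (String doc : docs) {
            writer.addDocument(doc);
        }
        writer.setUseCompoundFiles(true);    // only the final optimize writes it
        writer.optimize();
        writer.close();
    }

    public static void main(String[] args) {
        RecordingWriter writer = new RecordingWriter();
        indexAll(writer, Arrays.asList("doc1", "doc2"));
        for (String entry : writer.log) {
            System.out.println(entry);
        }
    }
}
```

As noted earlier in the thread, running with compound files off during indexing means more open file handles, so this ordering trades file-handle pressure for less disk IO.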