lucene-dev mailing list archives

From Doron Cohen <DOR...@il.ibm.com>
Subject flushRamSegments possible perf improvement?
Date Wed, 18 Oct 2006 22:29:26 GMT

Currently IndexWriter.flushRamSegments() always merges all RAM segments to
disk. Later it may merge more, depending on the maybe-merge algorithm. This
happens when the index is closed and when the number of (1-doc) RAM
segments exceeds max-buffered-docs.

Can there be a performance penalty for always merging to disk first?

Assume the following merges take place:
  merging segments _ram_0 (1 docs) _ram_1 (1 docs) ... _ram_N (1 docs) into
_a (N docs)
  merging segments _a (N docs) _6 (M docs) _7 (K docs) _8 (L docs) into _b
(N+M+K+L docs)

Alternatively, we could tell (compute) that this is going to happen, and
have a single merge:
  merging segments _ram_0 (1 docs) _ram_1 (1 docs) ... _ram_N (1 docs)
                   _6 (M docs) _7 (K docs) _8 (L docs) into _b (N+M+K+L
docs)

This would save writing the segment of size N to disk and reading it back
again. For large enough N, is there a real potential saving here?
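To put rough numbers on it, here is a back-of-the-envelope model (not
Lucene code; the segment sizes are made up for illustration) comparing the
documents written under the current two-step scheme against the proposed
single combined merge:

```java
// Hypothetical I/O model: compare doc-writes of the current two-step
// flush-then-merge against a single combined merge. Not Lucene code.
public class MergeIoModel {

    /**
     * Current scheme: flush N RAM docs into _a (N docs written),
     * then merge _a, _6, _7, _8 into _b (N+M+K+L docs written).
     */
    static long docsWrittenTwoStep(long n, long m, long k, long l) {
        return n + (n + m + k + l);
    }

    /** Proposed scheme: one merge straight into _b. */
    static long docsWrittenSingleMerge(long n, long m, long k, long l) {
        return n + m + k + l;
    }

    public static void main(String[] args) {
        // Illustrative sizes only.
        long n = 10_000, m = 5_000, k = 5_000, l = 5_000;
        long twoStep = docsWrittenTwoStep(n, m, k, l);
        long single  = docsWrittenSingleMerge(n, m, k, l);
        System.out.println("two-step writes:  " + twoStep);            // 35000
        System.out.println("single writes:    " + single);             // 25000
        System.out.println("saved doc writes: " + (twoStep - single)); // 10000
    }
}
```

In this model the combined merge saves exactly N doc-writes (plus the N
doc-reads needed to merge _a again), so the saving grows linearly with the
number of buffered RAM docs.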


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

