lucene-java-user mailing list archives

From Vitaly Funstein <vfunst...@gmail.com>
Subject Deferring merging of index segments
Date Wed, 30 May 2012 01:42:03 GMT
Hello,

I am trying to optimize the process of "warming up" an index prior to
using the search subsystem, i.e. it is guaranteed that no other writes
or searches can take place in parallel with the warmup. To that
end, I have been toying with the idea of turning off segment merging
altogether until after all the data has been written and committed. I
am currently using Lucene 3.0.3 and migration to a later version is
not an option in the short term. So, the way I'm going about turning
merging off is as follows:

1. Before warmup, call:

IndexWriter.setMaxMergeDocs(0);
IndexWriter.getLogMergePolicy().setMaxMergeMB(0);

2. After the warmup task completes, revert the above parameters to
their defaults, then call:

IndexWriter.maybeMerge();
IndexWriter.waitForMerges();
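
To make the two steps concrete, here is a minimal sketch of how they might look against the Lucene 3.0.x API. The calls in steps 1 and 2 are shorthand for methods on a writer instance. The class name DeferredMergeWarmup, the warmUp helper, and the dummy "body" field are hypothetical and only there for illustration. The sketch assumes the writer uses the default LogByteSizeMergePolicy (the cast fails otherwise), and it saves and restores the previous limits rather than hard-coding the defaults.

```java
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.LogByteSizeMergePolicy;
import org.apache.lucene.store.Directory;
import org.apache.lucene.util.Version;

public class DeferredMergeWarmup {

    /**
     * Hypothetical helper: writes docCount small documents with the merge
     * limits set to 0, then restores the limits and merges once at the end.
     * Returns maxDoc() as seen by the writer after the final merge pass.
     */
    public static int warmUp(Directory dir, int docCount) throws IOException {
        IndexWriter writer = new IndexWriter(dir,
                new StandardAnalyzer(Version.LUCENE_30),
                true, IndexWriter.MaxFieldLength.UNLIMITED);
        try {
            // Assumption: the default merge policy in 3.0.x is in use.
            LogByteSizeMergePolicy policy =
                    (LogByteSizeMergePolicy) writer.getMergePolicy();

            // Remember the current limits so they can be put back later.
            int savedMaxMergeDocs = writer.getMaxMergeDocs();
            double savedMaxMergeMB = policy.getMaxMergeMB();

            // Step 1: set both limits to 0 before the warmup.
            writer.setMaxMergeDocs(0);
            policy.setMaxMergeMB(0);

            // Stand-in for the real warmup data.
            for (int i = 0; i < docCount; i++) {
                Document doc = new Document();
                doc.add(new Field("body", "warmup document " + i,
                        Field.Store.YES, Field.Index.ANALYZED));
                writer.addDocument(doc);
            }
            writer.commit();

            // Step 2: restore the limits, then request and await merges.
            writer.setMaxMergeDocs(savedMaxMergeDocs);
            policy.setMaxMergeMB(savedMaxMergeMB);
            writer.maybeMerge();
            writer.waitForMerges();

            return writer.maxDoc();
        } finally {
            writer.close();
        }
    }
}
```

In the real system the Directory would be the on-disk index directory and the loop would be replaced by the actual warmup writes.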


I then compared the results of deferring segment merges with the
above method against a test run that let Lucene do the merging on the
fly. Curiously, the resulting size of the indexes on disk is about 64%
greater when merging is deferred, although the total time to complete
the warmup is almost the same.

So I have a few questions:
- is the approach for deferring segment merging flawed in some way?
- what could possibly account for the huge difference in file sizes?
- what else could I possibly try to further speed up index writing
during the system's "off hours"?

Thanks,
-V
