lucene-java-user mailing list archives

From Uwe Schindler <>
Subject Re: Does forceMerge(1) not always merge to one segment?
Date Mon, 22 May 2017 09:08:52 GMT

The merge policy has to tell IndexWriter which segments it needs to merge. When upgrading
an index, it returns only the segments written by an older version, and those are then
force-merged into one segment. That is just a hack to trigger the correct merges in
IndexUpgrader; the use of forceMerge for that is an implementation detail.

What's your issue with that? IndexUpgrader does what it should: after the upgrade you have
an index with only recent segment versions.
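
To make the behaviour concrete, here is a minimal, self-contained sketch of the selection logic as described in this message. It is a toy model, not Lucene code: the class UpgradeMergeSketch, the Segment type, and the integer version numbers are all hypothetical stand-ins, and the real merge policy may differ in detail (for example in how it compares versions, or in how many merges it produces for the old segments).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of "only old segments take part in forceMerge(1)". Not Lucene code. */
public class UpgradeMergeSketch {

    /** Hypothetical stand-in for a segment: a name plus the version that wrote it. */
    static final class Segment {
        final String name;
        final int version;

        Segment(String name, int version) {
            this.name = name;
            this.version = version;
        }
    }

    /** The "merge policy" step: offer only segments older than the latest version. */
    static List<Segment> selectForUpgrade(List<Segment> segments, int latestVersion) {
        List<Segment> old = new ArrayList<>();
        for (Segment s : segments) {
            if (s.version < latestVersion) {
                old.add(s);
            }
        }
        return old;
    }

    /**
     * Simulates the upgrade pass: the selected old segments collapse into one
     * new segment, while segments already on the latest version are left alone.
     */
    static List<Segment> upgrade(List<Segment> segments, int latestVersion) {
        List<Segment> old = selectForUpgrade(segments, latestVersion);
        List<Segment> result = new ArrayList<>();
        for (Segment s : segments) {
            if (s.version >= latestVersion) {
                result.add(s); // already current, so it is not part of the merge
            }
        }
        if (!old.isEmpty()) {
            result.add(new Segment("merged", latestVersion));
        }
        return result;
    }

    public static void main(String[] args) {
        // Two segments written by "v3", two already rewritten by "v4".
        List<Segment> index = Arrays.asList(
                new Segment("_0", 3), new Segment("_1", 3),
                new Segment("_2", 4), new Segment("_3", 4));

        List<Segment> afterV4 = upgrade(index, 4);
        System.out.println(afterV4.size()); // 3: two untouched v4 segments plus one merged

        List<Segment> afterV5 = upgrade(afterV4, 5);
        System.out.println(afterV5.size()); // 1: every segment was old, so all were merged
    }
}
```

In this model the caller's request for a single segment is honoured only when every segment is old. If some segments are already on the latest version, more than one segment remains after the pass.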


On 22 May 2017 07:39:54 CEST, Trejkaz <> wrote:
>On Mon, May 22, 2017 at 3:36 PM, Uwe Schindler <> wrote:
>> Hi Trejkaz,
>> yes, it calls forceMerge, but this is just a "trick" to look at each
>> segment while merging. In the end it decides, based on the version
>> number of each segment, whether it gets merged as part of the
>> forceMerge(1). If the version number of a segment is already on the
>> latest version (because the index was already used with 4.10 and new
>> documents were added/updated), it will just remove it from the list of
>> the segments to merge. It looks like the index in 4.10.4 already has a
>> lot of segments on the 4.10 version. When you upgrade it directly after
>> that to 5.5.2, of course it has to merge all segments, as all segments
>> are "only on 4.10".
>> The whole trick is in the IndexUpgraderMergePolicy. This one
>> implements the algorithm above.
>
>So a MergePolicy can override the desire of the caller to have a
>single segment? I thought the caller would ultimately get the power of
>veto, but I guess not. :)
>
>It turns out we were relying on this for a later migration, but I
>guess it might be easier to somehow make it work regardless of the
>number of segments that come out at the end. Because all indexes before
>this were created on v3 and the migration goes all the way to v5, it
>took quite a while to run into an index which didn't end up as a
>single segment after both passes... (I guess it would have to have
>over 100 segments?)

Uwe Schindler
Achterdiek 19, 28357 Bremen