lucene-dev mailing list archives

From "Shai Erera (JIRA)" <>
Subject [jira] Updated: (LUCENE-2701) Factor maxMergeSize into findMergesForOptimize in LogMergePolicy
Date Thu, 14 Oct 2010 09:29:32 GMT


Shai Erera updated LUCENE-2701:

    Attachment: LUCENE-2701.patch

The patch adds maxMergeMB handling to optimize as well. If no segment exceeds the
threshold, only the maxNumSegments constraint is taken into account. I've created
two private methods, findMergesForOptimizeMaxMergeSize and findMergesForOptimizeMaxNumSegments
(the original logic); findMergesForOptimize calls the relevant one.
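
To illustrate the dispatch described above, here is a minimal, self-contained sketch. It is not the code from the patch: the class OptimizeDispatchSketch, its method names, and the representation of segments as plain byte sizes are all assumptions made for illustration, and the size-constrained strategy shown (merge runs of adjacent segments that fit under the threshold, leave oversized ones alone) is only one plausible reading of the behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Lucene code: segments are modeled as byte sizes only.
class OptimizeDispatchSketch {

    /** True if at least one segment is larger than the threshold. */
    static boolean anySegmentExceeds(long[] segmentSizes, long maxMergeSize) {
        for (long size : segmentSizes) {
            if (size > maxMergeSize) {
                return true;
            }
        }
        return false;
    }

    /** Names the strategy that the dispatching method would pick. */
    static String chooseStrategy(long[] segmentSizes, long maxMergeSize) {
        // The size threshold takes precedence as soon as any segment exceeds it;
        // otherwise only maxNumSegments matters (the original optimize logic).
        return anySegmentExceeds(segmentSizes, maxMergeSize)
                ? "maxMergeSize"
                : "maxNumSegments";
    }

    /**
     * Assumed size-constrained strategy: each run of two or more adjacent
     * segments within the threshold becomes one merge, reported as a
     * half-open index range {start, end}. Oversized segments are skipped.
     */
    static List<int[]> findMergesMaxMergeSize(long[] segmentSizes, long maxMergeSize) {
        List<int[]> merges = new ArrayList<>();
        int start = -1; // start of the current run of small segments, or -1
        for (int i = 0; i <= segmentSizes.length; i++) {
            boolean small = i < segmentSizes.length && segmentSizes[i] <= maxMergeSize;
            if (small) {
                if (start < 0) {
                    start = i;
                }
            } else {
                if (start >= 0 && i - start > 1) {
                    merges.add(new int[] {start, i});
                }
                start = -1;
            }
        }
        return merges;
    }
}
```

In this sketch a lone small segment between two oversized ones is left untouched, since merging a single segment with nothing else would accomplish little; the actual patch may decide differently.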

I've also changed some members and methods to protected, to make extending LMP really
easy. As a result, I removed two methods from BalancedSegmentsMP that were copied over from

I took the opportunity to change OneMerge.segments and useCompoundFile to public - they are
final, so there is no risk of them being changed from the outside. Without this, you cannot
write a MP which queries the OneMerge objects. I also added totalSize(), which returns the
total size in bytes of that merge.
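
As a rough illustration of what exposing these members enables, the sketch below models a merge object with public final fields and a totalSize() method. OneMergeSketch is a hypothetical stand-in, not Lucene's OneMerge: it assumes each segment is represented only by its size in bytes, and that the total is a plain sum of those sizes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a merge description; not the Lucene class.
class OneMergeSketch {
    // Public and final: readable by any merge policy, but not reassignable.
    public final List<Long> segmentSizes;
    public final boolean useCompoundFile;

    OneMergeSketch(List<Long> segmentSizes, boolean useCompoundFile) {
        // Defensive copy wrapped as unmodifiable, so callers cannot alter the list either.
        this.segmentSizes = Collections.unmodifiableList(new ArrayList<>(segmentSizes));
        this.useCompoundFile = useCompoundFile;
    }

    /** Total size in bytes of all segments taking part in this merge. */
    public long totalSize() {
        long total = 0;
        for (long size : segmentSizes) {
            total += size;
        }
        return total;
    }
}
```

A custom policy outside the original package could then inspect a proposed merge, for example by comparing totalSize() against its own limit before accepting it.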

The patch also includes a test and a CHANGES entry.

> Factor maxMergeSize into findMergesForOptimize in LogMergePolicy
> ----------------------------------------------------------------
>                 Key: LUCENE-2701
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1, 4.0
>         Attachments: LUCENE-2701.patch
> LogMergePolicy allows you to specify a maxMergeSize in MB, which is taken into consideration
> in regular merges, yet ignored by findMergesForOptimize. I think it'd be good if we take that
> into consideration even when optimizing. This will allow the caller to specify two constraints:
> maxNumSegments and maxMergeMB. Obviously both may not be satisfied, and therefore we will
> guarantee that if there is any segment above the threshold, the threshold constraint takes
> precedence and therefore you may end up w/ <maxNumSegments (if it's not 1) after optimize.
> Otherwise, maxNumSegments is taken into consideration.
> As part of this change, I plan to change some methods to protected (from private) and
> members as well. I realized that if one wishes to implement his own LMP extension, he needs
> to either put it under o.a.l.index or copy some code over to his impl.
> I'll attach a patch shortly.
