lucene-dev mailing list archives

From "Shai Erera (JIRA)" <>
Subject [jira] Updated: (LUCENE-2701) Factor maxMergeSize into findMergesForOptimize in LogMergePolicy
Date Fri, 15 Oct 2010 13:22:32 GMT


Shai Erera updated LUCENE-2701:

    Attachment: LUCENE-2701.patch

You're right about the code - the 'else if' handles the case where there is a single
unoptimized segment to the right. I added a comment and combined the two branches into one
OR-ed if. I also added a test case.

OneMerge.totalSizeInBytes -- no one calls it now, but I would like to write an MP that does,
and that removes merges exceeding a specified total size. It's just a service method, so
you don't need to write it on your own. I renamed it to totalBytesSize and, along the way,
added totalNumDocs, which does the same for the number of docs.
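
For illustration only, here is a minimal sketch of what such service methods and a size-based filter might look like. SegmentStub, MergeSketch and dropOversized are hypothetical stand-ins invented for this example; they are not Lucene's SegmentInfo or OneMerge, and the attached patch may implement totalBytesSize and totalNumDocs differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a segment (NOT Lucene's SegmentInfo).
final class SegmentStub {
    final long sizeInBytes;
    final int numDocs;

    SegmentStub(long sizeInBytes, int numDocs) {
        this.sizeInBytes = sizeInBytes;
        this.numDocs = numDocs;
    }
}

// Hypothetical stand-in for a single merge (NOT Lucene's OneMerge).
final class MergeSketch {
    private final List<SegmentStub> segments;

    MergeSketch(List<SegmentStub> segments) {
        this.segments = segments;
    }

    // Sum of the sizes of all segments taking part in this merge.
    long totalBytesSize() {
        long total = 0;
        for (SegmentStub s : segments) {
            total += s.sizeInBytes;
        }
        return total;
    }

    // Sum of the document counts of all segments taking part in this merge.
    int totalNumDocs() {
        int total = 0;
        for (SegmentStub s : segments) {
            total += s.numDocs;
        }
        return total;
    }

    // Assumed policy helper: keep only merges whose total size is within maxBytes.
    static List<MergeSketch> dropOversized(List<MergeSketch> merges, long maxBytes) {
        List<MergeSketch> kept = new ArrayList<>();
        for (MergeSketch m : merges) {
            if (m.totalBytesSize() <= maxBytes) {
                kept.add(m);
            }
        }
        return kept;
    }
}
```

A custom merge policy could call something like dropOversized on the candidate merges it computed, which is the kind of use the service methods are meant to make easy.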

bq. Maybe note somewhere that now optimize (when there's a maxMergeDocs/MB constraint) is
able to merge fewer than mergeFactor segments at a time?

Wasn't it able to do so even before? E.g. if maxNumSegments < numSegments < mergeFactor?

> Factor maxMergeSize into findMergesForOptimize in LogMergePolicy
> ----------------------------------------------------------------
>                 Key: LUCENE-2701
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1, 4.0
>         Attachments: LUCENE-2701.patch, LUCENE-2701.patch, LUCENE-2701.patch
> LogMergePolicy allows you to specify a maxMergeSize in MB, which is taken into consideration
in regular merges, yet ignored by findMergesForOptimize. I think it'd be good if we take that
into consideration even when optimizing. This will allow the caller to specify two constraints:
maxNumSegments and maxMergeMB. Obviously both may not be satisfied, and therefore we will
guarantee that if there is any segment above the threshold, the threshold constraint takes
precedence and therefore you may end up with <maxNumSegments (if it's not 1) after optimize.
Otherwise, maxNumSegments is taken into consideration.
> As part of this change, I plan to change some methods to protected (from private) and
members as well. I realized that if one wishes to implement his own LMP extension, he needs
to either put it under o.a.l.index or copy some code over to his impl.
> I'll attach a patch shortly.
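
The interaction between the size threshold and optimize described in the issue can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the logic in the attached patch: OptimizeSketch and findMerges are hypothetical names, segments are modeled as plain sizes, only adjacent segments are merged, and the maxNumSegments bookkeeping is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (NOT LogMergePolicy): pick runs of adjacent segments to
// merge during an optimize, skipping any segment larger than maxMergeSize.
final class OptimizeSketch {

    // Returns [start, end) index ranges; each range holds at most mergeFactor segments.
    static List<int[]> findMerges(long[] sizes, long maxMergeSize, int mergeFactor) {
        List<int[]> merges = new ArrayList<>();
        int start = 0;
        while (start < sizes.length) {
            if (sizes[start] > maxMergeSize) {
                // Segment is above the threshold: leave it alone.
                start++;
                continue;
            }
            // Extend the run while segments stay under the threshold
            // and the run has fewer than mergeFactor segments.
            int end = start;
            while (end < sizes.length
                    && end - start < mergeFactor
                    && sizes[end] <= maxMergeSize) {
                end++;
            }
            // A run of one segment has nothing to merge with.
            if (end - start > 1) {
                merges.add(new int[] {start, end});
            }
            start = end;
        }
        return merges;
    }
}
```

With sizes {5, 50, 3, 4, 6, 60, 2}, a threshold of 10 and a mergeFactor of 10, this sketch selects only the run at indexes 2 to 4, so the two oversized segments are left untouched and the final segment count is dictated by the threshold rather than by maxNumSegments.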

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
