lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <>
Subject [jira] Updated: (LUCENE-982) Create new method optimize(int maxNumSegments) in IndexWriter
Date Mon, 19 Nov 2007 11:45:43 GMT


Michael McCandless updated LUCENE-982:

    Attachment: LUCENE-982.patch

Attached patch to implement new optimize(int maxNumSegments).

I fixed LogMergePolicy to respect the maxNumSegments arg.  When
optimizing, we first run all of the full merges (each one covering
mergeFactor segments, working back from the tail of the index)
concurrently.

Then, for the final partial (< mergeFactor segments) merge, we pick
contiguous segments that are the smallest net size, as long as the
resulting merged segment is not > 2X larger than the segment to its
left (to prevent creating a lopsided index over time).
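The selection rule for the final partial merge can be sketched roughly as below. This is a simplified model rather than the patch itself: segments are reduced to an array of sizes, the names FinalPartialMerge and pickStart are hypothetical, and the handling of a window at the very start of the index (which has no left neighbor) is an assumption.

```java
public class FinalPartialMerge {
    /**
     * Hypothetical helper: returns the start index of the contiguous window to
     * merge so that sizes.length drops to maxNumSegments, or -1 if no merge applies.
     */
    static int pickStart(long[] sizes, int maxNumSegments) {
        // Merging k contiguous segments into one removes k - 1 segments.
        int mergeSize = sizes.length - maxNumSegments + 1;
        if (mergeSize < 2) {
            return -1; // already at or below the target
        }
        int bestStart = -1;
        long bestSize = Long.MAX_VALUE;
        for (int start = 0; start + mergeSize <= sizes.length; start++) {
            long sum = 0;
            for (int j = 0; j < mergeSize; j++) {
                sum += sizes[start + j];
            }
            // Assumption: a window at index 0 has no left neighbor, so it always qualifies.
            boolean balanced = start == 0 || sum <= 2 * sizes[start - 1];
            if (balanced && sum < bestSize) {
                bestStart = start;
                bestSize = sum;
            }
        }
        return bestStart;
    }

    public static void main(String[] args) {
        long[] sizes = {100, 4, 9, 1, 50, 50};
        System.out.println(pickStart(sizes, 5));
    }
}
```

In the example in main, the cheapest pair is (9, 1), but merging it would produce a segment of size 10 next to a left neighbor of size 4, so the sketch skips it and picks the pair (4, 9) starting at index 1 instead.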

> Create new method optimize(int maxNumSegments) in IndexWriter
> -------------------------------------------------------------
>                 Key: LUCENE-982
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.3
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.3
>         Attachments: LUCENE-982.patch
> Spinning this out from the discussion in LUCENE-847.
> I think having a way to "slightly optimize" your index would be useful
> for many applications.
> The current optimize() call is very expensive for large indices
> because it always optimizes fully down to 1 segment.  If we add a new
> method which instead is allowed to stop optimizing once it has <=
> maxNumSegments segments in the index, this would allow applications to
> e.g. optimize down to, say, <= 10 segments after doing a bunch of updates.
> This should be a nice compromise of gaining good speedups of searching
> while not spending the full (and typically very high) cost of
> optimizing down to a single segment.
> Since LUCENE-847 is now formalizing an API for decoupling merge policy
> from IndexWriter, if we want to add this new optimize method we need
> to take it into account in LUCENE-847.
