lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Updated: (LUCENE-847) Factor merge policy out of IndexWriter
Date Mon, 10 Sep 2007 23:20:31 GMT

     [ https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-847:
--------------------------------------

    Attachment: LUCENE-847.take6.patch

OK, another rev of the patch (take6).  I think it's close!

This patch passes all unit tests with SerialMergeScheduler (left as
the default for now) and also passes all unit tests once you switch
the default to ConcurrentMergeScheduler instead.

I made one simplification to the approach: IndexWriter now keeps track
of "pendingMerges" (merges that mergePolicy has declared are necessary
but have not yet been started), and "runningMerges" (merges currently
in flight).  Then MergeScheduler just asks IndexWriter for the next
pending merge when it's ready to run it.  This also cleaned up how
cascading works.
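
To make the hand-off concrete, here is a minimal sketch of that
bookkeeping. It is a toy model, not code from the patch: the class
MergeQueueSketch, its method names, and the use of Strings to stand
in for merges are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

/** Toy stand-in for the writer-side bookkeeping; hypothetical, not Lucene code. */
public class MergeQueueSketch {
    // Merges the policy has declared necessary but that have not started yet.
    private final Deque<String> pendingMerges = new ArrayDeque<>();
    // Merges currently in flight.
    private final Set<String> runningMerges = new HashSet<>();

    /** Called when the merge policy declares a merge necessary. */
    public synchronized void registerMerge(String merge) {
        pendingMerges.addLast(merge);
    }

    /** Called by the scheduler when it is ready to run a merge; null if none is pending. */
    public synchronized String getNextMerge() {
        String merge = pendingMerges.pollFirst();
        if (merge != null) {
            runningMerges.add(merge); // moves from pending to running
        }
        return merge;
    }

    /** Called by the scheduler once the merge has completed. */
    public synchronized void mergeFinished(String merge) {
        runningMerges.remove(merge);
    }

    public synchronized int pendingCount() { return pendingMerges.size(); }

    public synchronized int runningCount() { return runningMerges.size(); }

    public static void main(String[] args) {
        MergeQueueSketch writer = new MergeQueueSketch();
        writer.registerMerge("_0+_1");
        writer.registerMerge("_2+_3");

        // A serial scheduler simply drains the queue on the calling thread.
        String merge;
        while ((merge = writer.getNextMerge()) != null) {
            System.out.println("running " + merge);
            writer.mergeFinished(merge);
        }
    }
}
```

A threaded scheduler would call the same getNextMerge() from each
worker thread; the synchronized methods keep the two collections
consistent in that case.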

Other changes:

  * Optimize: optimize is now fully concurrent (it can run multiple
    merges at once, new segments can be flushed during an optimize,
    etc).  Optimize will optimize only those segments present when it
    started (newly flushed segments may remain separate).

  * New API: optimize(boolean doWait) allows you to not wait for
    optimize to complete (it runs in background).  This only works
    when MergeScheduler uses threads.

  * New API: close(boolean doWait) allows you to not wait for running
    merges if you want to "close in a hurry".  Also only works when
    MergeScheduler uses threads.

  * I fixed LogMergePolicy to expose merge concurrency during
    optimize.  It first calls the "normal" merge policy and, if that
    requires any merges, returns them; otherwise it falls back to the
    usual "merge the tail <= mergeFactor segments until there is only
    1 left".

  * Because IndexModifier synchronizes on the directory, it can't use
    ConcurrentMergeScheduler: doing so quickly leads to deadlock, at
    least during IndexWriter.close.  So I set it back to
    SerialMergeScheduler (it is deprecated anyway).

  * Added a private IndexWriter.message(...) that prints a message to
    the infoStream, prefixed by the thread name, and changed all
    infoStream.print* calls to message(...).  Also added more
    messages in the exceptional cases to aid future diagnostics.

  * Added more unit tests
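
The two-step selection described for LogMergePolicy above can be
sketched roughly as follows. This is an illustration under
assumptions, not the patch itself: the class and method names are
hypothetical, segments are plain Strings, and the "normal" policy is
passed in as a function because its selection rules are not spelled
out in this message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch of the optimize-time merge selection; not Lucene code. */
public class OptimizeSelectionSketch {

    /**
     * Returns the merges to run next during an optimize.
     * normalPolicy is an assumed stand-in for the regular merge selection.
     */
    static List<List<String>> findMergesForOptimize(
            List<String> segments,
            int mergeFactor,
            Function<List<String>, List<List<String>>> normalPolicy) {
        // Step 1: if the normal policy wants merges, return those; several
        // may come back at once, which is what exposes concurrency.
        List<List<String>> normal = normalPolicy.apply(segments);
        if (!normal.isEmpty()) {
            return normal;
        }
        // Already down to a single segment (or none): nothing left to do.
        if (segments.size() <= 1) {
            return Collections.emptyList();
        }
        // Step 2: fall back to merging the tail of at most mergeFactor segments.
        int start = Math.max(0, segments.size() - mergeFactor);
        List<List<String>> result = new ArrayList<>();
        result.add(new ArrayList<>(segments.subList(start, segments.size())));
        return result;
    }

    public static void main(String[] args) {
        List<String> segments = Arrays.asList("_0", "_1", "_2", "_3", "_4");
        // Assumed normal policy that currently requests no merges.
        Function<List<String>, List<List<String>>> idle = s -> Collections.emptyList();
        System.out.println(findMergesForOptimize(segments, 3, idle));
    }
}
```

The caller would repeat the selection after each merge completes,
until it returns an empty list.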


> Factor merge policy out of IndexWriter
> --------------------------------------
>
>                 Key: LUCENE-847
>                 URL: https://issues.apache.org/jira/browse/LUCENE-847
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Steven Parkes
>            Assignee: Steven Parkes
>             Fix For: 2.3
>
>         Attachments: concurrentMerge.patch, LUCENE-847.patch.txt, LUCENE-847.patch.txt,
>                      LUCENE-847.take3.patch, LUCENE-847.take4.patch, LUCENE-847.take5.patch,
>                      LUCENE-847.take6.patch, LUCENE-847.txt
>
>
> If we factor the merge policy out of IndexWriter, we can make it pluggable, making it
> possible for apps to choose a custom merge policy and for easier experimenting with merge
> policy variants.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

