lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <>
Subject [jira] Updated: (LUCENE-870) add concurrent merge policy
Date Fri, 24 Aug 2007 16:39:31 GMT


Michael McCandless updated LUCENE-870:

    Attachment: LUCENE-870.take2.patch

Attaching patch that provides ConcurrentMergePolicyWrapper using the
"stateless API" approach for MergePolicy.  This must be used with the
patch I just attached to LUCENE-847.

This wrapper can wrap any MergePolicy instance and schedule the
requested merges using background threads, which frees IndexWriter
threads to continue adding/deleting docs.

CMPW accepts a "max thread count" limit: if more concurrent merges are
needed than this limit allows, it returns the overflow to IndexWriter,
which then runs those merges in the foreground.
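
The overflow behavior described above can be sketched roughly as follows.
This is only an illustration, not the code in the patch: the class name
BoundedBackgroundRunner and its methods are hypothetical, merges are
modeled as plain Runnables rather than Lucene merge objects, and it uses
newer Java syntax (lambdas) than JDK 1.5 supports.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration; NOT the class from the LUCENE-870 patch. */
public class BoundedBackgroundRunner {
    // One permit per allowed background merge thread.
    private final Semaphore slots;
    private final ExecutorService pool;

    public BoundedBackgroundRunner(int maxThreadCount) {
        this.slots = new Semaphore(maxThreadCount);
        this.pool = Executors.newFixedThreadPool(maxThreadCount);
    }

    /**
     * Starts each merge in the background while a slot is free.
     * Merges that do not fit are returned so the caller can run
     * them itself (in the foreground).
     */
    public List<Runnable> schedule(List<Runnable> merges) {
        List<Runnable> overflow = new ArrayList<>();
        for (Runnable merge : merges) {
            if (slots.tryAcquire()) {
                pool.execute(() -> {
                    try {
                        merge.run();
                    } finally {
                        slots.release(); // free the slot when the merge ends
                    }
                });
            } else {
                overflow.add(merge);
            }
        }
        return overflow;
    }

    /** Waits for background merges to finish, then stops the threads. */
    public void shutdown() {
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

A caller in this sketch would invoke schedule(...) and then run each
returned Runnable directly on its own thread, which mirrors the
"overflow runs in the foreground" behavior.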

The patch also adds two test cases to the existing TestStressIndexing
test that use ConcurrentMergePolicyWrapper.

I ran a quick test using this alg:

  ram.flush.mb = 16
  max.field.length = 2147483647
  doc.add.log.step = 5000

  {AddDoc >: *


For baseline I used "LogByteSizeMergePolicy". Then, I compared with
the same merge policy, but wrapped using ConcurrentMergePolicyWrapper.

Baseline took 1544 sec to index all of Wikipedia; with
ConcurrentMergePolicyWrapper it took 1155 sec (about 25% less time),
which is quite sizable.  This is a powerful way to make use of
concurrency without the complexity of having to add threads to your
indexing process.  (This is with JDK 1.5, on a quad-core Mac Pro with
4 drives in a RAID 0 array.)
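
For reference, the arithmetic behind that figure, using only the two
timings reported above, can be checked with a small sketch (the class
and method names are made up for illustration):

```java
import java.util.Locale;

/** Hypothetical helper for checking the reported timings. */
public class MergeBenchmarkMath {
    /** Percent of wall-clock time saved relative to the baseline. */
    static double percentTimeSaved(double baselineSec, double newSec) {
        return 100.0 * (baselineSec - newSec) / baselineSec;
    }

    /** How many times faster the new run is than the baseline. */
    static double throughputRatio(double baselineSec, double newSec) {
        return baselineSec / newSec;
    }

    public static void main(String[] args) {
        double baseline = 1544.0; // sec, LogByteSizeMergePolicy alone
        double wrapped = 1155.0;  // sec, wrapped in ConcurrentMergePolicyWrapper
        // Prints: time saved: 25.2%, throughput ratio: 1.34x
        System.out.println(String.format(Locale.ROOT,
            "time saved: %.1f%%, throughput ratio: %.2fx",
            percentTimeSaved(baseline, wrapped),
            throughputRatio(baseline, wrapped)));
    }
}
```

So the run finishes in about 25% less time, which corresponds to roughly
1.34x the indexing throughput.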

> add concurrent merge policy
> ---------------------------
>                 Key: LUCENE-870
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>            Reporter: Steven Parkes
>            Assignee: Steven Parkes
>         Attachments: CMP.patch.txt, concurrentMerge.patch, LUCENE-870.take2.patch
> Provide the ability to handle merges in one or more concurrent threads, i.e., concurrent
> with other IndexWriter operations.
> I'm factoring the code from LUCENE-847 for this.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
