lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Updated: (LUCENE-870) add concurrent merge policy
Date Fri, 24 Aug 2007 16:39:31 GMT

     [ https://issues.apache.org/jira/browse/LUCENE-870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-870:
--------------------------------------

    Attachment: LUCENE-870.take2.patch

Attaching patch that provides ConcurrentMergePolicyWrapper using the
"stateless API" approach for MergePolicy.  This must be used with the
patch I just attached to LUCENE-847.

This wrapper can wrap any MergePolicy instance and schedule the
requested merges using background threads, which frees IndexWriter
threads to continue adding/deleting docs.

CMPW accepts a "max thread count" limit: if the number of concurrent
merges needed exceeds this limit, it returns the overflow to
IndexWriter, which runs those merges in the foreground.
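
As a rough illustration of the thread-limit behavior described above,
here is a minimal, self-contained sketch.  It is not code from the
patch: the class BackgroundMergeDispatcher and its dispatch method are
hypothetical names, merges are modeled as plain Runnables rather than
Lucene merge objects, and it is written for a current JDK (lambdas)
rather than JDK 1.5.  It only shows the idea of starting merges in the
background while slots are free and handing the rest back to the
caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;

/** Hypothetical stand-in for the dispatch logic; not a Lucene class. */
class BackgroundMergeDispatcher {
    // One permit per background merge thread that may run at a time.
    private final Semaphore slots;

    BackgroundMergeDispatcher(int maxThreadCount) {
        this.slots = new Semaphore(maxThreadCount);
    }

    /**
     * Starts as many merges as free slots allow in background threads and
     * returns the rest (the overflow) for the caller to run itself.
     */
    List<Runnable> dispatch(List<Runnable> merges) {
        List<Runnable> overflow = new ArrayList<>();
        for (Runnable merge : merges) {
            if (slots.tryAcquire()) {
                Thread t = new Thread(() -> {
                    try {
                        merge.run();
                    } finally {
                        slots.release(); // free the slot when the merge ends
                    }
                });
                t.setDaemon(true);
                t.start();
            } else {
                // Over the limit: hand the merge back to the caller.
                overflow.add(merge);
            }
        }
        return overflow;
    }
}
```

In this sketch the caller (playing the role of IndexWriter) would loop
over the returned list and call run() on each entry in its own thread,
which corresponds to the foreground merges mentioned above.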

Also in the patch, I added 2 test cases to the existing
TestStressIndexing test that use ConcurrentMergePolicyWrapper.

I ran a quick test using this alg:

  analyzer=org.apache.lucene.analysis.SimpleAnalyzer
  doc.maker=org.apache.lucene.benchmark.byTask.feeds.LineDocMaker
  docs.file=/lucene/wikifull.txt
  directory=FSDirectory
  ram.flush.mb = 16
  max.field.length = 2147483647
  doc.add.log.step = 5000
  doc.maker.forever=false

  ResetSystemErase
  CreateIndex
  {AddDoc >: *
  CloseIndex

  RepSumByName

For baseline I used "LogByteSizeMergePolicy". Then, I compared with
the same merge policy, but wrapped using ConcurrentMergePolicyWrapper.

Baseline took 1544 sec to index all of Wikipedia; using
ConcurrentMergePolicyWrapper it took 1155 sec (about 25% less time),
which is quite sizable.  This is a powerful way to make use of
concurrency without the complexity of having to add threads to your
indexing process.  (This is with JDK 1.5, on a quad-core Mac Pro with
4 drives in a RAID 0 array.)


> add concurrent merge policy
> ---------------------------
>
>                 Key: LUCENE-870
>                 URL: https://issues.apache.org/jira/browse/LUCENE-870
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>            Reporter: Steven Parkes
>            Assignee: Steven Parkes
>         Attachments: CMP.patch.txt, concurrentMerge.patch, LUCENE-870.take2.patch
>
>
> Provide the ability to handle merges in one or more concurrent threads, i.e., concurrent with other IndexWriter operations.
> I'm factoring the code from LUCENE-847 for this.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

