lucene-solr-dev mailing list archives

From "Mike Klaas (JIRA)" <j...@apache.org>
Subject [jira] Updated: (SOLR-65) autoCommit/autoOptimize implementation + multithreaded document adding
Date Tue, 07 Nov 2006 02:40:37 GMT
     [ http://issues.apache.org/jira/browse/SOLR-65?page=all ]

Mike Klaas updated SOLR-65:
---------------------------

    Attachment: autocommit_patch.diff

New patch.

First, the locking semantics were actually wrong.  Since every addDoc call grabbed the commit
lock and then downgraded to the access lock, subsequent calls would block on the commit lock.
I tried a few vastly different schemes, and it took a while to find one that allowed concurrency
but also gave the same protections as before.  I finally settled on using the read/write commit
lock as the principal lock, with a touch of synchronization to protect the addDoc calls.
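
A minimal sketch of the scheme described above, not the code in the attached patch.  The
class name CommitLockSketch, its methods, and the list standing in for the index writer are
all hypothetical; only the idea (read lock for adds, write lock for commit, a short
synchronized section around the shared state) comes from the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration; not the DUH2 code from the patch.
class CommitLockSketch {
    private final ReentrantReadWriteLock commitLock = new ReentrantReadWriteLock();
    private final List<String> pending = new ArrayList<>();   // stands in for the index writer
    private final List<String> committed = new ArrayList<>();

    // Any number of adders may hold the read lock at the same time.
    void addDoc(String docId) {
        commitLock.readLock().lock();
        try {
            // Per-document work (e.g. analysis) could run here concurrently.
            synchronized (pending) {          // short critical section for the shared buffer
                pending.add(docId);
            }
        } finally {
            commitLock.readLock().unlock();
        }
    }

    // The write lock waits for in-flight adds and excludes new ones until commit finishes.
    int commit() {
        commitLock.writeLock().lock();
        try {
            int flushed = pending.size();
            committed.addAll(pending);
            pending.clear();
            return flushed;
        } finally {
            commitLock.writeLock().unlock();
        }
    }

    int committedCount() {
        commitLock.readLock().lock();
        try {
            return committed.size();
        } finally {
            commitLock.readLock().unlock();
        }
    }
}
```

With this shape, adders only contend with each other inside the small synchronized block,
while a commit still sees a quiescent buffer.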

That finally enabled concurrency, but other bottlenecks emerged.  checkCommit() was grabbing
the commit lock, which created a barrier at the end of every addDoc call: each call was forced
to wait for all pending addDoc calls.  I switched to synchronizing on the tracker (synchronizing
on DUH2 would provoke a potential deadlock).
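
As a rough illustration of checking the commit threshold under the tracker's own monitor
instead of the commit lock, here is a sketch.  TrackerSketch, maxDocs, and addedDocument()
are hypothetical names, and the counting policy is an assumption made for the example.

```java
// Hypothetical illustration; the real tracker in the patch may differ.
class TrackerSketch {
    private final int maxDocs;        // assumed threshold: commit after this many adds
    private int docsSinceCommit = 0;

    TrackerSketch(int maxDocs) {
        this.maxDocs = maxDocs;
    }

    // Called at the end of each add.  It locks only this tracker, so it never
    // has to wait for other in-flight adds the way a commit-lock acquisition would.
    synchronized boolean addedDocument() {
        docsSinceCommit++;
        if (docsSinceCommit >= maxDocs) {
            docsSinceCommit = 0;      // reset; the caller is expected to trigger the commit
            return true;
        }
        return false;
    }
}
```

The critical section is a counter update, so the time any adder spends holding the monitor
stays small.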

Finally, there was significant contention on the lock for the logger output stream.  When
merging wasn't occurring, the doc rate could reach 200-300 dps, and each docId was being logged.
I modified the bulk add code to log the docIds of all documents in a single log statement.
While I was at it, I converted the <result> output for multi-adds to a single xml element.
Was more information going to be added to this?
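
A sketch of the single-statement logging idea, using java.util.logging.  The class
BulkAddLogSketch, its methods, and the message format are invented for illustration and are
not what the patch emits.

```java
import java.util.List;
import java.util.logging.Logger;

// Hypothetical illustration of logging a whole batch at once.
class BulkAddLogSketch {
    private static final Logger log = Logger.getLogger(BulkAddLogSketch.class.getName());

    // Build one message for the whole batch (the format is an assumption).
    static String summarize(List<String> docIds) {
        return "added " + docIds.size() + " docs: " + String.join(",", docIds);
    }

    // One log call per batch rather than one per document, so the logger's
    // output lock is taken once per batch.
    static void logBatch(List<String> docIds) {
        log.info(summarize(docIds));
    }
}
```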

The gains of multi-threaded indexing for my application are modest.  CPU usage is consistently
>100%; it drops a bit during medium merges and drops a lot during large merges (merges
effectively serialize adding documents).  Still, the throughput gain is about 20-30%.  In
retrospect, this isn't terribly surprising, as our analysis is relatively modest.  Applications
with heavier analysis needs would see more gains.

> autoCommit/autoOptimize implementation + multithreaded document adding
> ----------------------------------------------------------------------
>
>                 Key: SOLR-65
>                 URL: http://issues.apache.org/jira/browse/SOLR-65
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Mike Klaas
>         Assigned To: Mike Klaas
>         Attachments: autocommit_patch.diff, autocommit_patch.diff
>
>
> Basic implementation of autoCommit/autoOptimize functionality, plus overhaul of DUH2
> threading to reduce contention

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
