lucene-java-user mailing list archives

From Vitaly Funstein <vfunst...@gmail.com>
Subject Re: Concurrent Indexing
Date Fri, 20 Jun 2014 18:39:22 GMT
You could just avoid calling commit() altogether if your application's
semantics allow this (i.e. it's non-transactional in nature). This way,
Lucene will flush buffered documents to disk when appropriate, based on the
buffering settings you chose; note that changes only become durable on
commit() or close(). It's generally unnecessary and undesirable to call
commit() at the end of each write, unless you need to provide strict
durability guarantees in your system.

If you must acknowledge every write after it's been committed, set up a
single committer thread that does this when there are any work tasks in the
queue. Then add to that queue from your writer threads...
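
A rough sketch of the single-committer idea, for illustration only. Nothing
here is Lucene API: GroupCommitter, CommitAction and requestCommit() are
hypothetical names, and CommitAction merely stands in for a call such as
IndexWriter.commit(). It also assumes Java 8 or later for CompletableFuture.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

/** Many writer threads enqueue commit requests; one thread performs the commits. */
final class GroupCommitter implements Runnable {

    /** Hypothetical stand-in for the real commit call (e.g. IndexWriter.commit()). */
    interface CommitAction {
        void commit() throws Exception;
    }

    private final BlockingQueue<CompletableFuture<Void>> pending = new LinkedBlockingQueue<>();
    private final CommitAction action;

    GroupCommitter(CommitAction action) {
        this.action = action;
    }

    /** Called by a writer thread AFTER its add/update call has returned. */
    CompletableFuture<Void> requestCommit() {
        CompletableFuture<Void> ack = new CompletableFuture<>();
        pending.add(ack);
        return ack;
    }

    @Override
    public void run() {
        List<CompletableFuture<Void>> batch = new ArrayList<>();
        try {
            while (true) {
                batch.add(pending.take()); // block until at least one request exists
                pending.drainTo(batch);    // pick up everything else already queued
                try {
                    action.commit();       // one commit covers the whole batch
                    for (CompletableFuture<Void> ack : batch) {
                        ack.complete(null);
                    }
                } catch (Exception e) {
                    for (CompletableFuture<Void> ack : batch) {
                        ack.completeExceptionally(e);
                    }
                }
                batch.clear();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop when interrupted
        }
    }
}
```

Under these assumptions, a writer thread would finish its updateDocument()
call, then call requestCommit().get() before acknowledging the write. Requests
that arrive while a commit is running are all served by the next commit, so
the number of commits grows more slowly than the number of writes.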


On Fri, Jun 20, 2014 at 8:47 AM, Umashanker, Srividhya <
srividhya.umashanker@hp.com> wrote:

> Lucene Experts -
>
> Recently we upgraded to Lucene 4. We want to make use of the concurrent
> flushing feature of Lucene.
>
> For us, indexing includes certain db operations and writing to Lucene,
> ending with a commit. There may be multiple concurrent calls to the Indexer
> to publish single/multiple records.
>
> So far, with the older version of Lucene, we had our indexing synchronized
> (1 thread indexing), which means callers wait longer, depending on
> concurrency and execution time.
>
> We are moving away from synchronized indexing in order to cut down the
> waiting period. We are trying to find out whether we have to limit the
> number of threads that add documents and commit.
>
> Below are the test results for publishing just 1000 records with 3 text fields.
>
> Java 7 , JVM config :  -XX:MaxPermSize=384M
>  -XX:+HeapDumpOnOutOfMemoryError  -Xmx400m -Xms50m -XX:MaxNewSize=100m
> -Xss256k -XX:-UseParallelOldGC -XX:-UseSplitVerifier
> -Djsse.enableSNIExtension=false
>
> IndexConfiguration is at its defaults. We also tried changing
> maxThreadStates, maxBufferedDocs, and ramBufferSizeMB, with no impact.
>
> Operation                      Min (ms)   Max (ms)   Avg (ms)
> 1 thread   - commit                  65        267         85
> 1 thread   - updateDocument           0         40          1
> 6 threads  - commit                  83       1449     552.42
> 6 threads  - updateDocument           0        175        1.5
> 10 threads - commit                 154       2429        874
> 10 threads - updateDocument           0        243        1.9
> 20 threads - commit                  76       4351       1622
> 20 threads - updateDocument           0        326        2.1
>
> The more threads try to write to Lucene, the more updateDocument() and
> commit() become bottlenecks. In the above table, 10 and 20 threads
> have an average of 1.5 sec for 1000 commits.
>
> Are there any configuration settings or suggestions for tuning the
> performance of these 2 methods, so that our service performs better with
> more concurrency?
>
> -vidhya
>
>
>
