lucene-java-user mailing list archives

From Marc Sturlese <marc.sturl...@gmail.com>
Subject lucene 4.0 and DocumentsWriterPerThreadPool compared to lucene 3.4
Date Tue, 11 Oct 2011 12:14:23 GMT
I'm running some bulk-indexing performance tests with Lucene 4.0 and I'm
seeing weird results. I've read
http://www.gossamer-threads.com/lists/lucene/java-dev/127190?do=post_view_threaded#127190
but I still have doubts.
I'm building a 1 GB index containing 1 million docs. While building the
index, I never search on it. I'm running with a 1000 MB Java heap on a
dual-core laptop with an SSD.
Using this configuration:
TieredMergePolicy
lucene_34
not optimizing, and committing only at the end
maxMergeAtOnce = 10
segmentsPerTier = 10
It takes 6 min.


Using this configuration:
TieredMergePolicy
lucene_40
not optimizing, and committing only at the end
maxMergeAtOnce = 10
segmentsPerTier = 10
DEFAULT_MAX_THREAD_STATES = 8 in DocumentsWriterPerThreadPool
It takes 20 min.
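
To put the two timings side by side, here is a minimal sketch of the throughput arithmetic. It assumes both runs indexed the same 1 million docs and that the 6 min and 20 min figures are wall-clock totals for the whole run. The class and method names are made up for illustration and are not part of Lucene.

```java
public class IndexingThroughput {

    /** Documents indexed per second for a run lasting the given number of minutes. */
    static double docsPerSecond(long docs, double minutes) {
        return docs / (minutes * 60.0);
    }

    /** How many times slower the second run is compared to the first. */
    static double slowdown(double baselineMinutes, double otherMinutes) {
        return otherMinutes / baselineMinutes;
    }

    public static void main(String[] args) {
        long docs = 1_000_000L;          // same corpus in both runs (assumption)
        double minutes34 = 6.0;          // reported time with lucene_34
        double minutes40 = 20.0;         // reported time with lucene_40

        System.out.println("3.4 docs/s: " + docsPerSecond(docs, minutes34));
        System.out.println("4.0 docs/s: " + docsPerSecond(docs, minutes40));
        System.out.println("slowdown:   " + slowdown(minutes34, minutes40));
    }
}
```

That works out to roughly 2,800 docs/s on 3.4 against about 830 docs/s on 4.0, a slowdown of about 3.3x.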

If I change DEFAULT_MAX_THREAD_STATES to 4 or even 1, I get almost the
same result.
I thought setting DEFAULT_MAX_THREAD_STATES = 1 would emulate the "old"
Lucene indexing behaviour.
I might be doing something wrong, because the three indexes built with 4.0
should have different numbers of segments (because of the different
DEFAULT_MAX_THREAD_STATES values), but they don't.

Is that normal? Any clue what could be wrong? (My trunk checkout is from yesterday.)
Thanks in advance.

