lucene-dev mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 34930] - IndexWriter.maybeMergeSegments() takes lots of CPU resources
Date Mon, 16 May 2005 16:11:32 GMT
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=34930>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=34930





------- Additional Comments From yseeley@gmail.com  2005-05-16 18:11 -------
(In reply to comment #3)
> Isn't the effect the same as setting mergeFactor to 2000, i.e. indexing gets  
> faster but more RAM is needed?  

On every add, it looks like the entire segment list is walked to see if
enough docs (minMergeDocs) can be collected together to do a merge.  With a
large minMergeDocs, this can get expensive.  Perhaps a count should be kept of
the number of docs in memory, and the merge logic called only when that count
exceeds minMergeDocs.

Of course keeping track of the count would require modifications to many
IndexWriter methods. It looks like the performance gains may well be worth it
though.
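
A rough sketch of the counting idea, for illustration only.  This is not
IndexWriter code: the class BufferedDocCounter, its fields, and its methods
are all hypothetical, and the merge itself is a stand-in that just clears the
buffered list.  Whether the check should be ">" or ">=" is also an assumption
here.  The point is only that each add does constant work against a running
count instead of walking the segment list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; NOT Lucene's IndexWriter.
class BufferedDocCounter {
    private final int minMergeDocs;
    // One entry per buffered single-document segment (stand-in for the segment list).
    private final List<Integer> bufferedSegments = new ArrayList<>();
    // Running count of docs in memory, maintained instead of re-walking the list.
    private int bufferedDocs = 0;
    // Number of times the (stand-in) merge logic has been invoked.
    private int mergeCalls = 0;

    BufferedDocCounter(int minMergeDocs) {
        if (minMergeDocs < 1) {
            throw new IllegalArgumentException("minMergeDocs must be >= 1");
        }
        this.minMergeDocs = minMergeDocs;
    }

    // Called once per added document: O(1) check against the counter.
    void addDocument() {
        bufferedSegments.add(1);
        bufferedDocs++;
        // Assumption: merge once the count reaches the threshold.
        if (bufferedDocs >= minMergeDocs) {
            mergeBuffered();
        }
    }

    // Stand-in for the real merge: drops the buffered segments and resets the count.
    private void mergeBuffered() {
        bufferedSegments.clear();
        bufferedDocs = 0;
        mergeCalls++;
    }

    int getBufferedDocs() {
        return bufferedDocs;
    }

    int getMergeCalls() {
        return mergeCalls;
    }
}
```

In a real writer, every method that adds or removes in-memory segments would
have to keep the count in sync, which is the modification cost mentioned
above.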

BTW, I don't think a mergeFactor of 1000 is typical (way too many
file descriptors in use, and a big hit to searchers).  A high minMergeDocs (now
maxBufferedDocs) *is* typical and useful though.




