lucene-dev mailing list archives

From "Shai Erera (JIRA)" <j...@apache.org>
Subject [jira] Created: (LUCENE-2755) Some improvements to CMS
Date Thu, 11 Nov 2010 13:46:14 GMT
Some improvements to CMS
------------------------

                 Key: LUCENE-2755
                 URL: https://issues.apache.org/jira/browse/LUCENE-2755
             Project: Lucene - Java
          Issue Type: Improvement
          Components: Index
            Reporter: Shai Erera
            Assignee: Shai Erera
            Priority: Minor
             Fix For: 3.1, 4.0


While running optimize on a large index, I noticed several things that led me to read the CMS
(ConcurrentMergeScheduler) code more carefully, and I found these issues:

* CMS may hold onto a merge if maxMergeCount is hit. As a result, the MergeThreads keep taking
merges from the IndexWriter until none are left, and only then does the blocked merge
run. I don't think that merge needs to be blocked.

* CMS sorts merges by segment size, measured in docs rather than bytes. Since the default MP
(MergePolicy) is LogByteSizeMP, and I doubt people still care about doc-based segment sizes, I think
we should switch the default impl to byte-based. There are two ways to make it extensible, if we want:
** Add an overridable member/method to CMS that subclasses can override - easy.
** Make OneMerge comparable and let the MP determine the order (e.g. by bytes, by docs, calibrated
for deletes, etc.). Better, but it needs to tap into several places in the code, so it is riskier
and more complicated.
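
To make the first option concrete, here is a minimal sketch in plain Java (8+), not a patch. PendingMerge, Scheduler, DocCountScheduler and mergeOrder() are hypothetical names made up for illustration and are not Lucene APIs. Sorting smallest-first is also just an assumption for the example; the real CMS may order differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class MergeOrderSketch {

    /** Hypothetical stand-in for a pending merge; this is NOT Lucene's OneMerge. */
    static final class PendingMerge {
        final String name;
        final int docCount;
        final long sizeInBytes;

        PendingMerge(String name, int docCount, long sizeInBytes) {
            this.name = name;
            this.docCount = docCount;
            this.sizeInBytes = sizeInBytes;
        }
    }

    /** Hypothetical scheduler exposing the ordering as an overridable hook. */
    static class Scheduler {
        /** Default: order by size in bytes (smallest first, chosen arbitrarily here). */
        protected Comparator<PendingMerge> mergeOrder() {
            return Comparator.comparingLong((PendingMerge m) -> m.sizeInBytes);
        }

        /** Returns the merge names in the order given by mergeOrder(). */
        List<String> order(List<PendingMerge> pending) {
            List<PendingMerge> sorted = new ArrayList<>(pending);
            sorted.sort(mergeOrder());
            List<String> names = new ArrayList<>();
            for (PendingMerge m : sorted) {
                names.add(m.name);
            }
            return names;
        }
    }

    /** A subclass that prefers doc-count ordering overrides only the hook. */
    static class DocCountScheduler extends Scheduler {
        @Override
        protected Comparator<PendingMerge> mergeOrder() {
            return Comparator.comparingInt((PendingMerge m) -> m.docCount);
        }
    }

    public static void main(String[] args) {
        List<PendingMerge> pending = Arrays.asList(
            new PendingMerge("a", 10, 5000L),    // few docs, many bytes
            new PendingMerge("b", 1000, 200L));  // many docs, few bytes

        System.out.println(new Scheduler().order(pending));         // prints [b, a]
        System.out.println(new DocCountScheduler().order(pending)); // prints [a, b]
    }
}
```

The two merges are picked so that the byte-based and doc-based orders disagree, which is the case where the choice of default matters.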

While I'm at it, I'd like to add some documentation to CMS - it's not very easy to read and follow.

I'll work on a patch.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
