lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2755) Some improvements to CMS
Date Tue, 16 Nov 2010 17:10:13 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932539#action_12932539 ]

Michael McCandless commented on LUCENE-2755:
--------------------------------------------


{quote}
bq. Ideally only IW.merge should call it (and it becomes private),

I wouldn't make it private. If I remember correctly, the Parallel Index overrode that method
to synchronize merges across all parallels.
{quote}

Ahh OK.

bq. But Mike, if you hit your maxMergeCount with large merges, then you won't run tiny merges
at all.

Sure, but that's uncommon.  Ie, large merges don't happen very
frequently.

bq. It's only if you have room to run any merges, that this 'pausing' actually helps. I trust
you when you say you've observed that not pausing those merges hurt performance, but I wonder
in real life, how often does that happen, and whether we should incorporate that in our code.
If it's a rare case, then perhaps apps that hit it should use another MS which pauses its
threads?

Remember it's not just pausing.  We also set thread priorities so that
smaller merges run with higher priority, and, all merges run with
higher priority than the indexing threads (by default).
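
A minimal sketch of the policy described above, assuming a simplified model
rather than the actual CMS code: merges are sorted largest first, the largest
ones are paused once more merges are running than the allowed thread count,
and the rest get increasing thread priorities so that smaller merges run at
higher priority and every merge runs above the indexing threads.  The class
MergeSchedulingSketch, the Merge type, the schedule method and the PAUSED
marker are all hypothetical names used for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MergeSchedulingSketch {

    /** Hypothetical stand-in for a running merge; not a Lucene class. */
    static final class Merge {
        final String name;
        final long sizeBytes;

        Merge(String name, long sizeBytes) {
            this.name = name;
            this.sizeBytes = sizeBytes;
        }
    }

    /** Marker value meaning "this merge is paused" (an assumption of this sketch). */
    static final int PAUSED = -1;

    /**
     * Returns a plan mapping each merge name to PAUSED or to a thread priority.
     * Smaller merges get higher priorities; all unpaused merges run above
     * indexingPriority, capped at Thread.MAX_PRIORITY.
     */
    static Map<String, Integer> schedule(List<Merge> running,
                                         int maxThreadCount,
                                         int indexingPriority) {
        List<Merge> sorted = new ArrayList<>(running);
        // Largest merge first, so the biggest merges are the first to be paused.
        sorted.sort(Comparator.comparingLong((Merge m) -> m.sizeBytes).reversed());

        // Pause only as many merges as exceed the allowed thread count.
        int toPause = Math.max(0, sorted.size() - maxThreadCount);
        // The first unpaused merge starts one step above the indexing threads.
        int priority = Math.min(indexingPriority + 1, Thread.MAX_PRIORITY);

        Map<String, Integer> plan = new LinkedHashMap<>();
        for (int i = 0; i < sorted.size(); i++) {
            Merge m = sorted.get(i);
            if (i < toPause) {
                plan.put(m.name, PAUSED);
            } else {
                plan.put(m.name, priority);
                priority = Math.min(priority + 1, Thread.MAX_PRIORITY);
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        List<Merge> running = Arrays.asList(
            new Merge("small", 5L * 1024 * 1024),
            new Merge("huge", 5L * 1024 * 1024 * 1024),
            new Merge("medium", 50L * 1024 * 1024),
            new Merge("big", 500L * 1024 * 1024));
        // Four merges, three allowed threads, indexing at normal priority.
        System.out.println(schedule(running, 3, Thread.NORM_PRIORITY));
    }
}
```

With four running merges and three allowed threads, only the largest merge is
paused in this model; the other three keep running, with the smallest at the
highest priority.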

I don't think this is rare because eventually (assuming your index is
big enough) you'll hit a large merge and then you can fairly easily
see the merges stack up.  I've seen merges stack up in the non-NRT
case too.  Without this explicit thread scheduling we do, that large
merge can easily kill your NRT reopens, ie take many seconds to get a
new reader.  This is non-graceful degradation because at first NRT
reopen time looks great but then as your index grows and you hit a
large merge, suddenly it's many seconds.

If your app has costly merges (eg you store fields, term vectors, and
you use dynamic fields which means the stores cannot be bulk merged),
and you're not on an SSD, and your OS is memory starved so it can't do
as much readahead as it should be doing, your merges become far more
costly.  Worse, the default merge thread count (3) may in fact be too high
for most machines even with 4 or more cores.  There are many variables...

The scheduling can only do so much, of course.  Ie it enables us to
soak up the "spare" CPU cycles in between medium and small merges to let
the big merge make progress.  But if those spare cycles aren't enough
then even the best scheduling will eventually have to
pause your reopens.

Still I think the other improvements we've talked about here would be
great steps forward.  It's just that we still need to explicitly
schedule the merge threads.


> Some improvements to CMS
> ------------------------
>
>                 Key: LUCENE-2755
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2755
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>
> While running optimize on a large index, I've noticed several things that got me to read CMS code more carefully, and find these issues:
> * CMS may hold onto a merge if maxMergeCount is hit. That results in the MergeThreads taking merges from the IndexWriter until they are exhausted, and only then that blocked merge will run. I think it's unnecessary that that merge will be blocked.
> * CMS sorts merges by segments size, doc-based and not bytes-based. Since the default MP is LogByteSizeMP, and I hardly believe people care about doc-based size segments anymore, I think we should switch the default impl. There are two ways to make it extensible, if we want:
> ** Have an overridable member/method in CMS that you can extend and override - easy.
> ** Have OneMerge be comparable and let the MP determine the order (e.g. by bytes, docs, calibrate deletes etc.). Better, but will need to tap into several places in the code, so more risky and complicated.
> On the go, I'd like to add some documentation to CMS - it's not very easy to read and follow.
> I'll work on a patch.
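
The first extensibility option in the issue description (an overridable
method that decides the merge order) could look roughly like the sketch
below.  This is an illustration under stated assumptions, not the CMS API:
MergeOrderSketch, PendingMerge, Scheduler, DocCountScheduler, mergeOrder and
order are hypothetical names.  The default orders merges by size in bytes,
and a subclass switches back to doc-count ordering.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class MergeOrderSketch {

    /** Hypothetical stand-in for a pending merge; not a Lucene class. */
    static final class PendingMerge {
        final String name;
        final int docCount;
        final long sizeBytes;

        PendingMerge(String name, int docCount, long sizeBytes) {
            this.name = name;
            this.docCount = docCount;
            this.sizeBytes = sizeBytes;
        }
    }

    /** Scheduler whose merge ordering can be replaced by overriding one method. */
    static class Scheduler {
        /** Default order: smallest merge first, measured in bytes. */
        protected Comparator<PendingMerge> mergeOrder() {
            return Comparator.comparingLong((PendingMerge m) -> m.sizeBytes);
        }

        /** Returns the merge names in the order given by mergeOrder(). */
        List<String> order(List<PendingMerge> merges) {
            List<PendingMerge> sorted = new ArrayList<>(merges);
            sorted.sort(mergeOrder());
            List<String> names = new ArrayList<>();
            for (PendingMerge m : sorted) {
                names.add(m.name);
            }
            return names;
        }
    }

    /** Subclass that orders by document count instead of bytes. */
    static class DocCountScheduler extends Scheduler {
        @Override
        protected Comparator<PendingMerge> mergeOrder() {
            return Comparator.comparingInt((PendingMerge m) -> m.docCount);
        }
    }
}
```

The two orderings disagree whenever a segment has few documents but many
bytes (for example, large stored fields), which is the case where doc-based
sorting misjudges the cost of a merge.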

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


