lucene-dev mailing list archives

From Shai Erera
Subject Re: Thoughts on CMS and SMS
Date Fri, 28 May 2010 20:11:49 GMT
The word executor helped me nail down my thoughts (thanks Earwin). I
think we should decouple MS into MergeExecutor and MergeScheduler.

ME would control how merges are done: serial, parallel, queue-based
(control the level of parallelism) etc.
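A minimal sketch of what such an ME could look like, assuming a hypothetical MergeExecutor interface (none of these types exist in Lucene) and representing each merge as a plain Runnable. The executor only decides how a merge runs and hands back a Future, so whether anyone waits on it stays a separate decision:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical interface, not part of Lucene: decides only HOW a merge is run. */
interface MergeExecutor {
    /** Starts (or queues) the merge and returns a handle the caller may wait on. */
    Future<?> submit(Runnable merge);

    /** Stops accepting new merges; already submitted ones still finish. */
    void shutdown();
}

/** Serial: runs each merge on the calling thread, one at a time. */
final class SerialMergeExecutor implements MergeExecutor {
    @Override
    public synchronized Future<?> submit(Runnable merge) {
        merge.run();
        return CompletableFuture.completedFuture(null);
    }

    @Override
    public void shutdown() {
        // Nothing to release: no threads are owned by this executor.
    }
}

/** Parallel and queue-based: at most maxThreads merges run at once, the rest wait in the pool's queue. */
final class PooledMergeExecutor implements MergeExecutor {
    private final ExecutorService pool;

    PooledMergeExecutor(int maxThreads) {
        this.pool = Executors.newFixedThreadPool(maxThreads);
    }

    @Override
    public Future<?> submit(Runnable merge) {
        return pool.submit(merge);
    }

    @Override
    public void shutdown() {
        pool.shutdown();
    }
}
```

With maxThreads set to 1 the pooled variant is still serial, but on a thread it spawns itself rather than on the app thread.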

MS would schedule the merges (perhaps a different name is needed). It
can be a blocking one (in the sense that the app thread is blocked) or
a non-blocking one. It can limit the merges to run a specified amount
of time and then abort, etc.
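To make the blocking / non-blocking / time-limited idea concrete, here is a rough sketch. SchedulingPolicy and MergeOutcome are hypothetical names, not Lucene classes, and the time limit is only applied in the blocking case to keep the example short:

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical result type for this sketch. */
enum MergeOutcome { COMPLETED, RUNNING_IN_BACKGROUND, ABORTED }

/** Hypothetical scheduling policy, not Lucene's MergeScheduler API. */
final class SchedulingPolicy {
    private final boolean blocking;
    private final long timeLimitMillis; // only used when blocking is true

    SchedulingPolicy(boolean blocking, long timeLimitMillis) {
        this.blocking = blocking;
        this.timeLimitMillis = timeLimitMillis;
    }

    /** Hands the merge to the executor and applies the blocking policy to the caller. */
    MergeOutcome schedule(ExecutorService executor, Runnable merge) {
        Future<?> pending = executor.submit(merge);
        if (!blocking) {
            // Non-blocking: the app thread continues while the merge runs.
            return MergeOutcome.RUNNING_IN_BACKGROUND;
        }
        try {
            // Blocking: hold the app thread, but only up to the time limit.
            pending.get(timeLimitMillis, TimeUnit.MILLISECONDS);
            return MergeOutcome.COMPLETED;
        } catch (TimeoutException e) {
            pending.cancel(true); // abort by interrupting the merge thread
            return MergeOutcome.ABORTED;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            pending.cancel(true);
            return MergeOutcome.ABORTED;
        } catch (ExecutionException e) {
            throw new IllegalStateException("merge failed", e.getCause());
        }
    }
}
```

Note that the policy never looks at how the executor runs the merge, which is the separation argued for here. Aborting via interruption only works if the merge code actually checks for interrupts.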

I think that all merges block other merges? Anyway, this can be a type
of MS too if it isn't the case already.

The point is that the decision about how merges affect my app threads,
or how long they are allowed to take, is unrelated to how the merges
are executed.

This will give a nice breakdown of responsibilities between:
1) MP - decides which segments to merge at all
2) ME - executes the merges
3) MS - controls all the merges from a high level, mostly the app-level.

If we want to consider some further refactoring to IW, we could even
tie all three of them together - MS would know of both ME and MP, and
IW would interact w/ MS only. This I admit is something that just
popped into my mind when writing this email, so perhaps it doesn't
make a lot of sense and needs some more mulling.
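A rough sketch of that wiring, with everything hypothetical: MergePolicy here is a stand-in interface (not Lucene's class of the same name), segments are plain strings, ME is represented by a java.util.concurrent.Executor, and the actual merge work is a callback supplied by the writer:

```java
import java.util.List;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

/** Hypothetical MP: decides which segments to merge at all. */
interface MergePolicy {
    /** Returns groups of segment names, each group to be merged into one segment. */
    List<List<String>> findMerges(List<String> segments);
}

/** Hypothetical MS (or "MergeManager"): the only merge-related object the writer talks to. */
final class MergeManager {
    private final MergePolicy policy;                // MP: what to merge
    private final Executor executor;                 // ME: how merges are run
    private final Consumer<List<String>> mergeWork;  // stand-in for the real merge code

    MergeManager(MergePolicy policy, Executor executor, Consumer<List<String>> mergeWork) {
        this.policy = policy;
        this.executor = executor;
        this.mergeWork = mergeWork;
    }

    /** Asks MP for merges, hands each one to ME, and returns how many were handed off. */
    int maybeMerge(List<String> segments) {
        List<List<String>> merges = policy.findMerges(segments);
        for (List<String> group : merges) {
            executor.execute(() -> mergeWork.accept(group));
        }
        return merges.size();
    }
}
```

Passing `Runnable::run` as the executor runs merges serially on the calling thread; passing a thread pool runs them concurrently, without MergeManager or the policy changing.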

Maybe MS should be renamed to MergeManager or something? Though I'm
fine w/ MS too.


On Friday, May 28, 2010, Earwin Burrfoot wrote:
> We just need an Executor-based MS, then we can throw all the others out
> the window, as threading concerns are now resolved by a proper choice
> of Executor supplied to the constructor.
> Also an application has much more control over threading in
> multiple-index situations, as single Executor can be reused for
> multiple MSs.
> On Fri, May 28, 2010 at 18:36, Michael McCandless wrote:
>> Hmm... so I think the questions really are "how many merges are
>> allowed to run concurrently?" (SMS is 1 and CMS is N), and "do I spawn
>> my own threads for merging or do I steal the app's threads" (SMS
>> steals app threads and CMS spawns new ones).
>> Of course if you steal app threads you can only make use of as much
>> concurrency as the app's threads...
>> Both SMS and CMS will allow other indexing ops to proceed, to a point,
>> but if the other indexing ops spawn too many merges, then those
>> threads will be blocked by both SMS and CMS.
>> So I'm not sure blocking/non-blocking is a good first split -- even
>> SMS isn't blocking other app indexing threads.
>> Mike
>> On Thu, May 27, 2010 at 3:58 AM, Shai Erera wrote:
>>> Hi
>>> I've been wondering recently why these two are named the way they are ... with a
>>> MS we're basically asking two questions: (1) should it block other merges
>>> from happening (or app thread from continuing) and (2) should it do its
>>> merges concurrently?
>>> SMS answers 'true' to (1) and 'false' to (2), while CMS answers the
>>> opposite.
>>> BUT, there's really no reason why these two are coupled. E.g. someone who
>>> wants to block other merges from running, or hold the app thread until
>>> merges are finished, does not necessarily want the merges to run in
>>> sequence. Those are two different decisions. Even if you want to block the
>>> application thread and other merges, you can still benefit from having the
>>> merges run concurrently.
>>> So, I was thinking that what we really want is a BlockingMS and
>>> NonBlockingMS that can be used according to the answer you look for in (1),
>>> and then we can have variants for both that execute the merges concurrently
>>> or not. I think that serial merging should be supported w/ BlockingMS only,
>>> but am interested in other opinions. One of the scenarios for serial merging
>>> is if the application wants to ensure no additional threads are spawned
>>> other than what it decided to spawn, and therefore it can only be used w/
>>> the BlockingMS. Another scenario is to control IO, but in this case a
>>> NonBlockingSerialMS may fit as well (depends if you think other merges may
>>> start while this one is running).
>>> In fact, w/o changing much, we can have CMS optionally block other merges /
>>> app thread by waiting until all merges are finished. We may even stick w/
>>> the current SMS/CMS names, just documenting that CMS can be used to block
>>> threads, only merges will be done concurrently.
>>> What do you think?
>>> Shai
> --
> Kirill Zakharenko/Кирилл Захаренко (
> Phone: +7 (495) 683-567-4
> ICQ: 104465785
