lucene-dev mailing list archives

From Earwin Burrfoot
Subject Re: Thoughts on CMS and SMS
Date Sat, 29 May 2010 01:40:54 GMT
Should I explicitly say I meant java.util.concurrent.Executor? :)

serial, parallel, queue-based, ride-the-caller-thread - all these
decisions could be made by existing framework, and there's no need to
invent something.
So what we /do/ need is an entity that decides on which merges are
needed at all, an entity that performs a single merge, and
j.u.c.Executor in between.
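
To make the split concrete, here is a minimal sketch of that shape. It is
not Lucene code: MergeChooser, ExecutorMergeScheduler and the segment
names are hypothetical stand-ins, and the "merge" is a placeholder that
only records its name. The only real API used is java.util.concurrent.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ExecutorMergeSketch {

    /** Hypothetical: the entity that decides which merges are needed at all. */
    interface MergeChooser {
        List<String> pendingMerges();
    }

    /** Hypothetical scheduler: threading is decided entirely by the Executor it is given. */
    static final class ExecutorMergeScheduler {
        private final Executor executor;
        final List<String> completed = new CopyOnWriteArrayList<>();

        ExecutorMergeScheduler(Executor executor) {
            this.executor = executor;
        }

        void merge(MergeChooser chooser) {
            for (String m : chooser.pendingMerges()) {
                // Placeholder for "perform a single merge": just record it.
                executor.execute(() -> completed.add(m));
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        MergeChooser chooser = () -> List.of("seg1+seg2", "seg3+seg4");

        // Ride the caller thread: each merge runs inline, one after another.
        ExecutorMergeScheduler inline = new ExecutorMergeScheduler(Runnable::run);
        inline.merge(chooser);
        System.out.println("caller-thread merges: " + inline.completed);

        // Parallel: one pool, shared by two schedulers (e.g. two indexes).
        ExecutorService pool = Executors.newFixedThreadPool(2);
        ExecutorMergeScheduler a = new ExecutorMergeScheduler(pool);
        ExecutorMergeScheduler b = new ExecutorMergeScheduler(pool);
        a.merge(chooser);
        b.merge(chooser);
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("pooled merges: " + a.completed.size() + " + " + b.completed.size());
    }
}
```

Swapping Runnable::run for Executors.newSingleThreadExecutor() would give
serial merges off the caller thread, with no change to the scheduler.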

On an offtopic note - time-limiting merges seems pointless to me.
Either don't call huge merges, or wait patiently for them to end.
Unlike half-done search (which can have a purpose), half-done merge
is time wasted.

2010/5/29 Shai Erera <>:
> The word executor helped me nail down my thoughts (thanks Earwin). I
> think we should decouple MS into MergeExecutor and MergeScheduler.
> ME would control how merges are done: serial, parallel, queue-based
> (control the level of parallelism) etc.
> MS would schedule the merges (perhaps a different name is needed). It
> can be a blocking one (in the sense that the app thread is blocked) or
> a non-blocking one. It can limit the merges to run a specified amount
> of time and then abort, etc.
> I think that all merges block other merges? Anyway, this can be a type
> of MS too if it isn't the case already.
> Point is - the decision I make about how should the merges affect my
> app threads or how long do they take, is not related to how the merges
> are executed.
> This will give a nice breakdown of responsibilities between:
> 1) MP - decides which segments to merge at all
> 2) ME - executes the merges
> 3) MS - controls all the merges from a high level, mostly the app-level.
> If we want to consider some further refactoring to IW, we could even
> tie all three of them together - MS would know of both ME and MP, and
> IW would interact w/ MS only. This I admit is something that just
> popped into my mind when writing this email, so perhaps it doesn't
> make a lot of sense and needs some more mulling.
> Maybe MS should be renamed to MergeManager or something? Though I'm
> fine w/ MS too.
> Shai
> On Friday, May 28, 2010, Earwin Burrfoot <> wrote:
>> We just need an Executor-based MS, then we can throw all the others out of
>> the window, as threading concerns are now resolved by a proper choice
>> of Executor supplied to constructor.
>> Also an application has much more control over threading in
>> multiple-index situations, as single Executor can be reused for
>> multiple MSs.
>> On Fri, May 28, 2010 at 18:36, Michael McCandless
>> <> wrote:
>>> Hmm... so I think the questions really are "how many merges are
>>> allowed to run concurrently?" (SMS is 1 and CMS is N), and "do I spawn
>>> my own threads for merging or do I steal the app's threads" (SMS
>>> steals app threads and CMS spawns new ones).
>>> Of course if you steal app threads you can only make use of as much
>>> concurrency as the app's threads...
>>> Both SMS and CMS will allow other indexing ops to proceed, to a point,
>>> but if the other indexing ops spawn too many merges, then those
>>> threads will be blocked by both SMS and CMS.
>>> So I'm not sure blocking/non-blocking is a good first split -- even
>>> SMS isn't blocking other app indexing threads.
>>> Mike
>>> On Thu, May 27, 2010 at 3:58 AM, Shai Erera <> wrote:
>>>> Hi
>>>> I've been thinking recently why are these two named like they are ... with
>>>> MS we're basically asking two questions: (1) should it block other merges
>>>> from happening (or app thread from continuing) and (2) should it do its
>>>> merges concurrently?
>>>> SMS answers 'true' to (1) and 'false' to (2), while CMS answers the
>>>> opposite.
>>>> BUT, there's really no reason why these two are coupled. E.g. someone who
>>>> wants to block other merges from running, or hold the app thread until
>>>> merges are finished, does not necessarily want the merges to run in
>>>> sequence. Those are two different decisions. Even if you want to block the
>>>> application thread and other merges, you can still benefit from having the
>>>> merges run concurrently.
>>>> So, I was thinking that what we really want is a BlockingMS and
>>>> NonBlockingMS that can be used according to the answer you look for in (1),
>>>> and then we can have variants for both that execute the merges concurrently
>>>> or not. I think that serial merging should be supported w/ BlockingMS only,
>>>> but am interested in other opinions. One of the scenarios for serial merging
>>>> is if the application wants to ensure no additional threads are spawned
>>>> other than what it decided to spawn, and therefore it can only be used w/
>>>> the BlockingMS. Another scenario is to control IO, but in this case a
>>>> NonBlockingSerialMS may fit as well (depends if you think other merges may
>>>> start while this one is running).
>>>> In fact, w/o changing much, we can have CMS optionally block other merges
>>>> or the app thread by waiting until all merges are finished. We may even stick w/
>>>> the current SMS/CMS names, just documenting that CMS can be used to block
>>>> threads, only merges will be done concurrently.
>>>> What do you think?
>>>> Shai
>> --
>> Kirill Zakharenko/Кирилл Захаренко (
>> Phone: +7 (495) 683-567-4
>> ICQ: 104465785

Kirill Zakharenko/Кирилл Захаренко (
Phone: +7 (495) 683-567-4
ICQ: 104465785
