lucene-dev mailing list archives

From Shai Erera <ser...@gmail.com>
Subject Re: Thoughts on CMS and SMS
Date Sat, 29 May 2010 05:06:02 GMT
Yes I got it that you were referring to the Java Executor :).

I disagree about time limiting MS. It may not be useful in many cases,
true. But I have a scenario in which machines are used to perform all
sorts of tasks and there are windows in which I'm allowed to do 'heavy
operations'.

It's true I can just choose not to merge large segments, but I thought
that instead of guessing (even if it'd be an educated guess) which
segments I should pick for different time windows, I'll limit the time
the MS runs. That in addition to not picking large segments for short
time periods.
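
To make the time-window idea concrete, here is a minimal sketch of running merges one at a time and giving up when the window closes. It is not Lucene API: TimeBoxedMerges and runWithin are hypothetical names, each merge is modeled as a plain Runnable, and the only library calls are from java.util.concurrent. It assumes an aborted merge responds to thread interruption.

```java
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch, not part of Lucene.
final class TimeBoxedMerges {
    /**
     * Runs the given merges serially until the time window is used up.
     * Returns how many merges completed inside the window.
     */
    static int runWithin(List<Runnable> merges, long windowMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(windowMillis);
        int completed = 0;
        try {
            for (Runnable merge : merges) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    break; // window closed, do not start another merge
                }
                Future<?> running = executor.submit(merge);
                try {
                    running.get(remaining, TimeUnit.NANOSECONDS);
                    completed++;
                } catch (TimeoutException e) {
                    running.cancel(true); // interrupt the half-done merge
                    break;
                } catch (ExecutionException e) {
                    // this merge failed; move on to the next one
                } catch (InterruptedException e) {
                    running.cancel(true);
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        } finally {
            executor.shutdownNow();
        }
        return completed;
    }
}
```

The work done by a cancelled merge is thrown away, which is exactly Earwin's objection; the sketch only shows that the time limit can live outside the code that performs a merge.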

There are many different scenarios out there Earwin. Some look bizarre
I admit :).

The entity which executes a single merge, today, is IW. Do you think
we need a different entity? For what purpose?
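
For reference, the three-way split Earwin describes in the quoted mail (something that decides which merges are needed, something that performs a single merge, and a j.u.c.Executor in between) could be sketched roughly as follows. The interface and class names are made up for illustration and a merge is reduced to a String description; only java.util.concurrent.Executor is a real API here.

```java
import java.util.List;
import java.util.concurrent.Executor;

// Hypothetical sketch, not part of Lucene.
final class ExecutorMergeSketch {
    /** Decides which merges are needed at all (the MP role). */
    interface MergeSelector {
        List<String> pendingMerges();
    }

    /** Performs a single merge (what IW does today). */
    interface SingleMerger {
        void merge(String mergeSpec);
    }

    /** Hands every pending merge to the executor; the executor owns all threading decisions. */
    static void runMerges(MergeSelector selector, SingleMerger merger, Executor executor) {
        for (String spec : selector.pendingMerges()) {
            executor.execute(() -> merger.merge(spec));
        }
    }
}
```

Passing Runnable::run as the Executor rides the caller thread, much like SMS, while a pool from Executors.newFixedThreadPool(n) would give CMS-like concurrency, and one pool could be shared by several indexes.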

Shai

On Saturday, May 29, 2010, Earwin Burrfoot <earwin@gmail.com> wrote:
> Should I explicitly say I meant java.util.concurrent.Executor? :)
>
> serial, parallel, queue-based, ride-the-caller-thread - all these
> decisions could be made by existing framework, and there's no need to
> invent something.
> So what we /do/ need is an entity that decides on which merges are
> needed at all, an entity that performs a single merge, and
> j.u.c.Executor in between.
>
> On an offtopic note - time-limiting merges seems pointless to me.
> Either don't call huge merges, or wait patiently for them to end.
> Unlike half-done search (which can have a purpose), half-done merge
> is time wasted.
>
> 2010/5/29 Shai Erera <serera@gmail.com>:
>> The word executor helped me nail down my thoughts (thanks Earwin). I
>> think we should decouple MS into MergeExecutor and MergeScheduler.
>>
>> ME would control how merges are done: serial, parallel, queue-based
>> (control the level of parallelism) etc.
>>
>> MS would schedule the merges (perhaps a different name is needed). It
>> can be a blocking one (in the sense that the app thread is blocked) or
>> a non-blocking one. It can limit the merges to run a specified amount
>> of time and then abort, etc.
>>
>> I think that all merges block other merges? Anyway, this can be a type
>> of MS too if it isn't the case already.
>>
>> Point is - the decision I make about how should the merges affect my
>> app threads or how long do they take, is not related to how the merges
>> are executed.
>>
>> This will give a nice breakdown of responsibilities between:
>> 1) MP - decides which segments to merge at all
>> 2) ME - executes the merges
>> 3) MS - controls all the merges from a high level, mostly the app-level.
>>
>> If we want to consider some further refactoring to IW, we could even
>> tie all three of them together - MS would know of both ME and MP, and
>> IW would interact w/ MS only. This I admit is something that just
>> popped into my mind when writing this email, so perhaps it doesn't
>> make a lot of sense and needs some more mulling.
>>
>> Maybe MS should be renamed to MergeManager or something? Though I'm
>> fine w/ MS too.
>>
>> Shai
>>
>> On Friday, May 28, 2010, Earwin Burrfoot <earwin@gmail.com> wrote:
>>> We just need an Executor-based MS, then we can throw all other out of
>>> the window, as threading concerns are now resolved by a proper choice
>>> of Executor supplied to constructor.
>>> Also an application has much more control over threading in
>>> multiple-index situations, as single Executor can be reused for
>>> multiple MSs.
>>>
>>> On Fri, May 28, 2010 at 18:36, Michael McCandless
>>> <lucene@mikemccandless.com> wrote:
>>>> Hmm... so I think the questions really are "how many merges are
>>>> allowed to run concurrently?" (SMS is 1 and CMS is N), and "do I spawn
>>>> my own threads for merging or do I steal the app's threads" (SMS
>>>> steals app threads and CMS spawns new ones).
>>>>
>>>> Of course if you steal app threads you can only make use of as much
>>>> concurrency as the app's threads...
>>>>
>>>> Both SMS and CMS will allow other indexing ops to proceed, to a point,
>>>> but if the other indexing ops spawn too many merges, then those
>>>> threads will be blocked by both SMS and CMS.
>>>>
>>>> So I'm not sure blocking/non-blocking is a good first split -- even
>>>> SMS isn't blocking other app indexing threads.
>>>>
>>>> Mike
>>>>
>>>> On Thu, May 27, 2010 at 3:58 AM, Shai Erera <serera@gmail.com> wrote:
>>>>> Hi
>>>>>
>>>>> I've been thinking recently why are these two named like they are ... with a
>>>>> MS we're basically asking two questions: (1) should it block other merges
>>>>> from happening (or app thread from continuing) and (2) should it do its
>>>>> merges concurrently?
>>>>>
>>>>> SMS answers 'true' to (1) and 'false' to (2), while CMS answers the
>>>>> opposite.
>>>>>
>>>>> BUT, there's really no reason why these two are coupled. E.g. someone who
>>>>> wants to block other merges from running, or hold the app thread until
>>>>> merges are finished, does not necessarily want the merges to run in
>>>>> sequence. Those are two different decisions. Even if you want to block the
>>>>> application thread and other merges, you can still benefit from having the
>>>>> merges run concurrently.
>>>>>
>>>>> So, I was thinking that what we really want is a BlockingMS and
>>>>> NonBlockingMS that can be used according to the answer you look for in (1),
>>>>> and then we can have variants for both that execute the merges concurrently
>>>>> or not. I think that serial
