aurora-dev mailing list archives

From Maxim Khutornenko <ma...@apache.org>
Subject Re: [PROPOSAL] Disallow instance removal in job update
Date Mon, 08 Feb 2016 17:42:29 GMT
> Or without any persistence at all.  The client could refuse to adjust the
> instance count on a job unless there's an additional command line argument.
> The same arguments of responsibility could be said here of users of old
> clients or custom clients.

Bill, are you suggesting that the 'aurora update start' client command
call the scheduler to acquire an update diff first, and block the
startJobUpdate RPC call unless a special command line flag is present?
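
A rough sketch of the client-side check described above, assuming the
scheduler can report current and desired instance counts as a diff.
UpdateDiff, may_start_update and allow_instance_removal are hypothetical
names used for illustration; none of them exist in the Aurora client.

```python
# Hypothetical sketch: these names are illustrative, not Aurora client APIs.
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateDiff:
    """Summary of what an update would change, as reported by the scheduler."""
    current_instances: int  # instances the job has right now
    desired_instances: int  # instance count from the .aurora config

    @property
    def removed(self) -> int:
        """Number of instances the update would remove (never negative)."""
        return max(0, self.current_instances - self.desired_instances)


def may_start_update(diff: UpdateDiff, allow_instance_removal: bool = False) -> bool:
    """Return True if the client should go on to call startJobUpdate.

    The update is blocked only when it would remove instances and the
    (assumed) command line flag was not given.
    """
    return diff.removed == 0 or allow_instance_removal
```

With this shape, an update that grows or keeps the instance count goes
through unchanged, and only a shrinking update needs the extra flag.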

> When updating a job, the scheduler would fill in the current instance count.
> However, when I want to change the number of instances, I could simply
> bind another value locally when triggering the update.

Stephan, this sounds like increasing instances would also require a
binding helper, which makes the update process less deterministic (i.e.
the .aurora config file is no longer self-contained).
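
To make the determinism concern concrete, here is a minimal sketch of how
such a binding might resolve. Plain string substitution stands in for
pystachio, and resolve_instances with its parameters is a hypothetical
helper, not an existing Aurora or pystachio function.

```python
# Hypothetical sketch: plain string substitution stands in for pystachio.
from typing import Optional

PLACEHOLDER = "{{aurora.instances}}"


def resolve_instances(
    template: str,
    scheduler_count: int,
    local_override: Optional[int] = None,
) -> int:
    """Resolve the instance count for an update.

    A locally bound value wins; otherwise the scheduler's current count
    is filled in. A template without the placeholder is used as written.
    """
    value = scheduler_count if local_override is None else local_override
    return int(template.replace(PLACEHOLDER, str(value)))
```

The same config text resolves to different counts depending on scheduler
state and on what was bound locally, which is the sense in which the file
stops being self-contained.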

On Sun, Feb 7, 2016 at 3:02 PM, Erb, Stephan
<Stephan.Erb@blue-yonder.com> wrote:
> A related idea that recently crossed my mind was some kind of pystachio
> variable / binding helper: {{aurora.instances}}.
>
> When updating a job, the scheduler would fill in the current instance count.
> However, when I want to change the number of instances, I could simply bind
> another value locally when triggering the update.
> ________________________________________
> From: Maxim Khutornenko <maxim@apache.org>
> Sent: Saturday, February 6, 2016 00:07
> To: dev@aurora.apache.org
> Subject: Re: [PROPOSAL] Disallow instance removal in job update
>
> We have attempted to safeguard the client updater command with a
> "dangerous change" warning before, but it did not get good feedback.
> Besides, automated tools/scripts just ignored it.
>
> An alternative could be what George suggested on the scaling API thread
> mentioned earlier: automatically bump up the instance count to the job's
> active task count. I'd say this could be an implementation of the
> proposal above rather than a safeguard, as it accomplishes the exact
> same goal.
>
> Bill, do you have any ideas of what that safeguard could be?
>
> On Fri, Feb 5, 2016 at 2:56 PM, Bill Farner <wfarner@apache.org> wrote:
>>>
>>> the outdated instance count problem will only get worse as automated
>>> scaling tools will quickly render existing .aurora config value obsolete
>>
>>
>> This is not a compelling reason to remove functionality.  Sounds like a
>> safeguard is needed instead.
>>
>> On Fri, Feb 5, 2016 at 2:43 PM, Maxim Khutornenko <maxim@apache.org> wrote:
>>
>>> This is mostly a survey rather than a proposal. What would people think
>>> about limiting the updater to only adding/updating instances and letting
>>> killTasks take care of instance removals?
>>>
>>> We have all heard stories (or happened to create some ourselves) where an
>>> outdated instance count value in the .aurora config caused unexpected
>>> instance removals. Granted, there are plenty of other values in the
>>> config that can cause a service-wide outage, but instance count seems to
>>> be the worst in that sense.
>>>
>>> After the recent refactoring of addInstances and killTasks to act as
>>> scaleOut/scaleIn APIs [1], the outdated instance count problem will
>>> only get worse as automated scaling tools will quickly render existing
>>> .aurora config value obsolete. With that in mind, should we block
>>> instance removal in the updater and let an explicit killTasks call be
>>> the only acceptable action to reduce instance count? Is there any
>>> value (aside from the arguable convenience factor) in having
>>> startJobUpdate ever kill instances?
>>>
>>> Thanks,
>>> Maxim
>>>
>>> [1] - http://markmail.org/message/2smaej5n5e54li3g
>>>

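The two scheduler-side options raised in this thread (rejecting an update
that would remove instances, and bumping the requested count up to the
active task count) can be sketched as follows. This is only an
illustration under assumed names: reject_instance_removal and
bump_to_active_count are hypothetical and are not scheduler code.

```python
# Hypothetical sketch of the two options; not taken from the scheduler.

def reject_instance_removal(active_count: int, requested_count: int) -> None:
    """Option 1: the updater refuses any update that would shrink the job.

    Shrinking would then have to go through an explicit killTasks call.
    """
    if requested_count < active_count:
        raise ValueError(
            "update would remove %d instance(s); use killTasks instead"
            % (active_count - requested_count)
        )


def bump_to_active_count(active_count: int, requested_count: int) -> int:
    """Option 2: silently raise the requested count to the active count."""
    return max(active_count, requested_count)
```

Both variants leave growing updates untouched; they differ only in whether
a stale, lower instance count fails loudly or is corrected quietly.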