aurora-dev mailing list archives

From "Erb, Stephan" <Stephan....@blue-yonder.com>
Subject Re: [PROPOSAL] Disallow instance removal in job update
Date Sun, 07 Feb 2016 23:02:16 GMT
A related idea that recently crossed my mind was some kind of pystachio variable / binding
helper:  {{aurora.instances}}.

When updating a job, the scheduler would fill in the current instance count. However, when
I want to change the number of instances, I could simply bind another value locally when triggering
the update.
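
A minimal sketch of how such a binding might resolve, written in plain Python rather than against pystachio itself. The helper name `resolve_instances`, the `local_bindings` parameter, and the `{{aurora.instances}}` variable are all hypothetical; nothing like this exists in Aurora or pystachio today. The scheduler-supplied count acts as the default and a locally bound value overrides it.

```python
import re

# Hypothetical helper: illustrates the idea only, not a pystachio or Aurora API.
_REF = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def resolve_instances(template, scheduler_count, local_bindings=None):
    """Resolve an instance-count template such as '{{aurora.instances}}'.

    scheduler_count stands in for the current instance count the scheduler
    would fill in; local_bindings, if given, takes precedence over it.
    """
    bindings = {"aurora.instances": scheduler_count}
    bindings.update(local_bindings or {})

    def substitute(match):
        name = match.group(1)
        if name not in bindings:
            raise KeyError(f"unbound variable: {name}")
        return str(bindings[name])

    return int(_REF.sub(substitute, template))
```

With no local binding, an update would keep whatever count the scheduler reports; passing `{"aurora.instances": 20}` would scale the job explicitly.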
________________________________________
From: Maxim Khutornenko <maxim@apache.org>
Sent: Saturday, February 6, 2016 00:07
To: dev@aurora.apache.org
Subject: Re: [PROPOSAL] Disallow instance removal in job update

We have had attempts to safeguard the client updater command with a
"dangerous change" warning before, but it did not get good feedback.
Besides, automated tools/scripts just ignored it.

An alternative could be what George suggested on the scaling API thread
mentioned earlier: automatically bump up the instance count to the job's
active task count. I'd say this could be an implementation of the
proposal above rather than a safeguard, as it accomplishes the exact
same goal.
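
A rough sketch of that bump-up rule, assuming it means "never let an update shrink the job below what is currently running". The function name and parameters are hypothetical and do not correspond to any scheduler API.

```python
def effective_instance_count(configured_count, active_task_count):
    """Return the instance count an update would actually apply.

    Hypothetical rule: a stale (too small) configured count is raised to the
    number of active tasks, so the update never removes instances. A larger
    configured count still scales the job out.
    """
    if configured_count < 0 or active_task_count < 0:
        raise ValueError("counts must be non-negative")
    return max(configured_count, active_task_count)
```

Under this rule a config that still says 10 instances for a job scaled out to 25 would update all 25, and shrinking would remain an explicit killTasks call.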

Bill, do you have any ideas of what that safeguard could be?

On Fri, Feb 5, 2016 at 2:56 PM, Bill Farner <wfarner@apache.org> wrote:
>>
>> the outdated instance count problem will only get worse as automated
>> scaling tools will quickly render existing .aurora config value obsolete
>
>
> This is not a compelling reason to remove functionality.  Sounds like a
> safeguard is needed instead.
>
> On Fri, Feb 5, 2016 at 2:43 PM, Maxim Khutornenko <maxim@apache.org> wrote:
>
>> This is mostly a survey rather than a proposal. What would people think
>> about limiting the updater to only adding/updating instances and letting
>> killTasks take care of instance removals?
>>
>> We have all heard stories (or happened to create some ourselves) where an
>> outdated instance count value in .aurora config caused unexpected
>> instance removals. Granted, there are plenty of other values in the
>> config that can cause a service-wide outage, but instance count seems to
>> be the worst in that sense.
>>
>> After the recent refactoring of addInstances and killTasks to act as
>> scaleOut/scaleIn APIs [1], the outdated instance count problem will
>> only get worse as automated scaling tools will quickly render existing
>> .aurora config value obsolete. With that in mind, should we block
>> instance removal in the updater and let an explicit killTasks call be
>> the only acceptable action to reduce instance count? Is there any
>> value (aside from an arguable convenience factor) in having
>> startJobUpdate ever kill instances?
>>
>> Thanks,
>> Maxim
>>
>> [1] - http://markmail.org/message/2smaej5n5e54li3g
>>
