aurora-dev mailing list archives

From Bill Farner <wfar...@apache.org>
Subject Re: [PROPOSAL] Disallow instance removal in job update
Date Fri, 05 Feb 2016 23:31:04 GMT
Or without any persistence at all.  The client could refuse to adjust the
instance count on a job unless there's an additional command-line argument.
The same responsibility argument applies here to users of old clients or
custom clients.
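
A minimal sketch of such a client-side check, assuming the client can look
up the job's current instance count before submitting the update. The
function name and the `allow_reduction` opt-in are hypothetical and not part
of the Aurora client. The sketch guards only reductions, since that is the
failure mode discussed in this thread; guarding any change would compare
with `!=` instead.

```python
def check_instance_count(current: int, requested: int,
                         allow_reduction: bool = False) -> None:
    """Refuse an update that would remove instances unless the user opted in.

    `current` is the instance count the job has now, `requested` is the
    count from the config being applied. Both names are illustrative.
    """
    if requested < current and not allow_reduction:
        removed = current - requested
        raise ValueError(
            f"update would remove {removed} instance(s); "
            "re-run with the explicit opt-in argument to allow this"
        )


# Growing or keeping the count passes silently; shrinking needs the opt-in.
check_instance_count(current=5, requested=8)
check_instance_count(current=5, requested=3, allow_reduction=True)
```

Because the check needs nothing stored on the scheduler side, it only
protects users of a client that implements it.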

On Fri, Feb 5, 2016 at 3:17 PM, John Sirois <john@conductant.com> wrote:

> On Fri, Feb 5, 2016 at 4:07 PM, Maxim Khutornenko <maxim@apache.org>
> wrote:
>
> > We have had attempts to safeguard the client updater command with a
> > "dangerous change" warning before but it did not get good feedback.
> > Besides, automated tools/scripts just ignored it.
> >
> > An alternative could be what George suggested on the scaling API thread
> > mentioned earlier: automatically bump up instance count to the job
> > active task count. I'd say this could be an implementation of the
> > proposal above rather than a safeguard as it accomplishes the exact
> > same goal.
> >
> > Bill, do you have any ideas of what that safeguard could be?
> >
>
> I'd recommend that an API call that reduced instance count require a
> `confirm_instance_reduction=true` parameter - this could be plumbed back
> to a flag in the official Aurora client.
> That said, since Aurora immediately forgets jobs and splits things into
> tasks, I'm not sure this is even sanely possible today.
>
> Assuming it is possible, any human that turns that flag on by default with
> a shell alias or an rc file can take responsibility for their own problem.
> If a tool passes the boolean, again - that's the tool's problem.  Hopefully
> it's a carefully developed and vetted auto-scaling tool.
>
>
> > On Fri, Feb 5, 2016 at 2:56 PM, Bill Farner <wfarner@apache.org> wrote:
> > >>
> > >> the outdated instance count problem will only get worse as automated
> > >> scaling tools will quickly render existing .aurora config value
> > >> obsolete
> > >
> > >
> > > This is not a compelling reason to remove functionality.  Sounds like a
> > > safeguard is needed instead.
> > >
> > > On Fri, Feb 5, 2016 at 2:43 PM, Maxim Khutornenko <maxim@apache.org>
> > > wrote:
> > >
> > >> This is mostly a survey rather than a proposal. What would people think
> > >> about limiting the updater to only adding/updating instances and letting
> > >> killTasks take care of instance removals?
> > >>
> > >> We have all heard stories (or happened to create some ourselves) where an
> > >> outdated instance count value in the .aurora config caused unexpected
> > >> instance removals. Granted, there are plenty of other values in the
> > >> config that can cause a service-wide outage, but instance count seems to
> > >> be the worst in that sense.
> > >>
> > >> After the recent refactoring of addInstances and killTasks to act as
> > >> scaleOut/scaleIn APIs [1], the outdated instance count problem will
> > >> only get worse as automated scaling tools will quickly render existing
> > >> .aurora config value obsolete. With that in mind, should we block
> > >> instance removal in the updater and let an explicit killTasks call be
> > >> the only acceptable action to reduce instance count? Is there any
> > >> value (aside from an arguable convenience factor) in having
> > >> startJobUpdate ever killing instances?
> > >>
> > >> Thanks,
> > >> Maxim
> > >>
> > >> [1] - http://markmail.org/message/2smaej5n5e54li3g
> > >>
> >
>
>
>
> --
> John Sirois
> 303-512-3301
>
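
The two scheduler-side ideas quoted above (John's confirmation parameter and
the alternative of bumping the requested count up to the active task count)
can be sketched together as follows. This is an illustration under stated
assumptions, not Aurora's API: only the parameter name
`confirm_instance_reduction` comes from the thread, while the function and
the `bump_to_active` switch are hypothetical, and the sketch assumes the
scheduler can determine the job's active task count at update time, which
the thread itself questions.

```python
def resolve_instance_count(active: int, requested: int,
                           confirm_instance_reduction: bool = False,
                           bump_to_active: bool = False) -> int:
    """Return the instance count an update should be allowed to apply.

    `active` is the job's current active task count and `requested` is the
    count in the submitted config. Both names are illustrative.
    """
    if requested >= active:
        # Adding or keeping instances is never blocked.
        return requested
    if confirm_instance_reduction:
        # The caller explicitly accepted the reduction.
        return requested
    if bump_to_active:
        # Silently raise the stale config value to what is running now.
        return active
    raise PermissionError(
        f"update would reduce instances from {active} to {requested}; "
        "set confirm_instance_reduction to proceed"
    )


print(resolve_instance_count(10, 4, bump_to_active=True))
print(resolve_instance_count(10, 4, confirm_instance_reduction=True))
```

With bumping enabled, an outdated config can no longer shrink a job, so
removals would have to go through an explicit killTasks call, which is the
outcome the original proposal asks for.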
