Return-Path: X-Original-To: apmail-aurora-dev-archive@minotaur.apache.org Delivered-To: apmail-aurora-dev-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id ACACD186C5 for ; Wed, 20 Jan 2016 01:06:55 +0000 (UTC) Received: (qmail 26753 invoked by uid 500); 20 Jan 2016 01:06:55 -0000 Delivered-To: apmail-aurora-dev-archive@aurora.apache.org Received: (qmail 26696 invoked by uid 500); 20 Jan 2016 01:06:55 -0000 Mailing-List: contact dev-help@aurora.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@aurora.apache.org Delivered-To: mailing list dev@aurora.apache.org Received: (qmail 26685 invoked by uid 99); 20 Jan 2016 01:06:55 -0000 Received: from mail-relay.apache.org (HELO mail-relay.apache.org) (140.211.11.15) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 20 Jan 2016 01:06:55 +0000 Received: from mail-io0-f176.google.com (mail-io0-f176.google.com [209.85.223.176]) by mail-relay.apache.org (ASF Mail Server at mail-relay.apache.org) with ESMTPSA id 02AB11A0323 for ; Wed, 20 Jan 2016 01:06:54 +0000 (UTC) Received: by mail-io0-f176.google.com with SMTP id g73so4205337ioe.3 for ; Tue, 19 Jan 2016 17:06:54 -0800 (PST) X-Gm-Message-State: ALoCoQm/Zx06XHvwgT2ZMFPkNjuGFes2PCjXRLkzN6RLRwf3hb4Q7RQlzjHx3y8ZmiVuNZwac1SvKrdfhEmt7s0O6+bKTSx/JA== MIME-Version: 1.0 X-Received: by 10.107.6.156 with SMTP id f28mr27213165ioi.46.1453252014059; Tue, 19 Jan 2016 17:06:54 -0800 (PST) Received: by 10.107.12.37 with HTTP; Tue, 19 Jan 2016 17:06:53 -0800 (PST) In-Reply-To: References: <1452848997962.73259@blue-yonder.com> <1453050051951.10303@blue-yonder.com> Date: Tue, 19 Jan 2016 17:06:53 -0800 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PROPOSAL] Job instance scaling APIs From: Maxim Khutornenko To: dev@aurora.apache.org Content-Type: text/plain; charset=UTF-8 Hmm, I have no idea where I got that old version from :) Sounds good to me. On Tue, Jan 19, 2016 at 5:02 PM, Bill Farner wrote: > Nit: your examples above include SessionKey, which no longer exists. > > Otherwise LGTM. > > On Tue, Jan 19, 2016 at 4:58 PM, Maxim Khutornenko wrote: > >> I think the behavior is spot on. >> >> So the plan is to move the current: >> >> Response addInstances( >> 1: AddInstancesConfig config, >> 2: Lock lock, >> 3: SessionKey session) >> >> to: >> >> Response addInstances( >> 1: AddInstancesConfig config, >> 2: Lock lock, >> 3: SessionKey session, >> 4: InstanceKey instanceKey, >> 5: int32 instanceCount) >> >> In the current release we will fork the behavior depending on whether >> the InstanceKey is present. In the next release, we drop 1,2 and 3 and >> end up with only 4 and 5. >> >> Any concerns? >> >> On Tue, Jan 19, 2016 at 3:57 PM, Bill Farner wrote: >> > Unless the behavior differs significantly, repurposing sounds like the >> > right thing to do. We don't *need* a separate RPC to accomplish this, we >> > can just add arguments or struct fields to accomplish dual use. >> > >> > On Tue, Jan 19, 2016 at 3:53 PM, Maxim Khutornenko >> wrote: >> > >> >> That thought crossed my mind but due to addInstances() being reserved >> >> for an existing RPC I filed AURORA-1581 to handle its graceful >> >> deprecation first. >> >> >> >> Bill, are you suggesting re-purposing the existing 'addInstances()' >> >> into a 'scaleOut()' equivalent? I am mostly +1 on the change but if we >> >> do it within the same release we will not be following our deprecation >> >> guidelines. Perhaps we can name it something like 'addTaskInstances()' >> >> instead and let addInstances() go away naturally? >> >> >> >> On Tue, Jan 19, 2016 at 3:06 PM, Tony Dong >> >> wrote: >> >> > +1 to addInstances >> >> > >> >> > On Tue, Jan 19, 2016 at 3:00 PM, Bill Farner >> wrote: >> >> > >> >> >> At risk of devolving the discussion, is it worth calling the method >> >> >> addInstances as opposed to scaleOut? I find the former more >> >> descriptive. >> >> >> >> >> >> On Tue, Jan 19, 2016 at 11:12 AM, Maxim Khutornenko < >> maxim@apache.org> >> >> >> wrote: >> >> >> >> >> >> > "Of course, the scaler could manually health check that all >> instances >> >> >> > have come up and are being used as expected, but I guess that is >> what >> >> >> > Aurora is for." >> >> >> > >> >> >> > I'd argue the updater "watch_secs" health checking isn't enough to >> >> >> > ensure graceful rollout as instances may start flapping right after >> >> >> > the updater signs off. Instances outside of update window may also >> >> >> > flap (e.g. due to backend pressure) and updater will not be able to >> >> >> > catch that. That's why a robust autoscaler has to rely on external >> >> >> > monitoring tools and overall job health instead. >> >> >> > >> >> >> > A very basic approach, as you mentioned above, could be querying >> job >> >> >> > status repeatedly and count the ratio of tasks in RUNNING vs active >> >> >> > (ASSIGNED, PENDING, THROTTLED, STARTING, etc.) states in order to >> make >> >> >> > a scaleOut decision. The more reliable approach though would also >> rely >> >> >> > on external monitoring stats exposed by user processes. That would >> be >> >> >> > a much higher fidelity signal than a decision based on task status >> >> >> > alone. Scheduler does not (and should not for scalability reasons) >> >> >> > have visibility into those stats, so the autoscaler would be in a >> much >> >> >> > better position to make an executive decision there. >> >> >> > >> >> >> > On Sun, Jan 17, 2016 at 9:00 AM, Erb, Stephan >> >> >> > wrote: >> >> >> > > I believe the operation is not that simple when you look at the >> >> >> > end-to-end scenario. >> >> >> > > >> >> >> > > For example, the implementation of an auto-scaler using the new >> >> >> > scaleOut() API could look like: >> >> >> > > >> >> >> > > 1) check some KPI >> >> >> > > 2) Infer an action based on this KPI such as scaleUp() or >> >> scaleDown() >> >> >> > > 3) wait until the effects of the adjusted instance count is >> >> reflected >> >> >> in >> >> >> > the KPI. Go to 1 and repeat. >> >> >> > > >> >> >> > > The health checking capabilities of the existing updater (in >> >> particular >> >> >> > together with [1]) would be really helpful here. Still, the >> simplified >> >> >> > scaleOut() API would offer the great benefit that the auto-scaler >> >> would >> >> >> not >> >> >> > need to know about the used aurora configuration. >> >> >> > > >> >> >> > > We even had an incident with a sub-optimal implementation of step >> >> 3): >> >> >> An >> >> >> > overloaded package backend lead to slow service startups. The >> service >> >> >> > startup took longer than the grace-period of our auto-scaler. It >> >> >> therefore >> >> >> > decided to add more and more instances, because the KPI wasn't >> >> improving >> >> >> as >> >> >> > expected. It had no way of knowing that these instances were not >> even >> >> >> > 'running'. The additionally added instances aggravated the overload >> >> >> > situation of the package backend. Of course, the scaler could >> >> manually >> >> >> > health check that all instances have come up and are being used as >> >> >> > expected, but I guess that is what Aurora is for. >> >> >> > > >> >> >> > > [1] >> >> >> > >> >> >> >> >> >> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit?pref=2&pli=1#heading=h.n0kb37aiy8ua >> >> >> > > >> >> >> > > Best Regards, >> >> >> > > Stephan >> >> >> > > ________________________________________ >> >> >> > > From: Maxim Khutornenko >> >> >> > > Sent: Friday, January 15, 2016 7:06 PM >> >> >> > > To: dev@aurora.apache.org >> >> >> > > Subject: Re: [PROPOSAL] Job instance scaling APIs >> >> >> > > >> >> >> > > I wasn't planning on using the rolling updater functionality >> given >> >> the >> >> >> > > simplicity of the operation. I'd second Steve's earlier concerns >> >> about >> >> >> > > scaleOut() looking more like startJobUpdate() if we keep adding >> >> >> > > features. If health watching, throttling (batch_size) or >> rollback on >> >> >> > > failure is required then I believe the startJobUpdate() should be >> >> used >> >> >> > > instead of scaleOut(). >> >> >> > > >> >> >> > > On Fri, Jan 15, 2016 at 1:09 AM, Erb, Stephan >> >> >> > > wrote: >> >> >> > >> I really like the proposal. The gain in simplicity on the >> >> client-side >> >> >> > by not having to provide an aurora config is quite significant. >> >> >> > >> >> >> >> > >> The implementation on the scheduler side is probably rather >> >> straight >> >> >> > forward as the update can be reused. That would also provide us >> with >> >> the >> >> >> > update UI, which has shown to be quite useful when tracing >> autoscaler >> >> >> > events. >> >> >> > >> >> >> >> > >> Regards, >> >> >> > >> Stephan >> >> >> > >> ________________________________________ >> >> >> > >> From: Maxim Khutornenko >> >> >> > >> Sent: Thursday, January 14, 2016 9:50 PM >> >> >> > >> To: dev@aurora.apache.org >> >> >> > >> Subject: Re: [PROPOSAL] Job instance scaling APIs >> >> >> > >> >> >> >> > >> "I'd be concerned that any >> >> >> > >> scaling API to be powerful enough to fit all (most) use cases >> would >> >> >> just >> >> >> > >> end up looking like the update API." >> >> >> > >> >> >> >> > >> There is a big difference between scaleOut and startJobUpdate >> APIs >> >> >> > >> that justifies the inclusion of the former. Namely, scaleOut may >> >> only >> >> >> > >> replicate the existing instances without changing/introducing >> any >> >> new >> >> >> > >> scheduling requirements or performing instance >> rollout/rollback. I >> >> >> > >> don't see scaleOut ever becoming more powerful to threaten >> >> >> > >> startJobUpdate. At the same time, the absence of aurora config >> >> >> > >> requirement is a huge boost to autoscaling client >> simplification. >> >> >> > >> >> >> >> > >> "For example, when scaling down we don't just kill the last N >> >> >> > instances, we >> >> >> > >> actually look at the least loaded hosts (globally) and kill >> tasks >> >> from >> >> >> > >> those." >> >> >> > >> >> >> >> > >> I don't quite see why the same wouldn't be possible with a >> scaleIn >> >> >> > >> API. Isn't it always external process responsibility to pay due >> >> >> > >> diligence before killing instances? >> >> >> > >> >> >> >> > >> >> >> >> > >> On Thu, Jan 14, 2016 at 12:35 PM, Steve Niemitz < >> >> sniemitz@apache.org> >> >> >> > wrote: >> >> >> > >>> As some background, we handle scale up / down purely from the >> >> client >> >> >> > side, >> >> >> > >>> using the update API for both directions. I'd be concerned >> that >> >> any >> >> >> > >>> scaling API to be powerful enough to fit all (most) use cases >> >> would >> >> >> > just >> >> >> > >>> end up looking like the update API. >> >> >> > >>> >> >> >> > >>> For example, when scaling down we don't just kill the last N >> >> >> > instances, we >> >> >> > >>> actually look at the least loaded hosts (globally) and kill >> tasks >> >> >> from >> >> >> > >>> those. >> >> >> > >>> >> >> >> > >>> >> >> >> > >>> On Thu, Jan 14, 2016 at 3:28 PM, Maxim Khutornenko < >> >> maxim@apache.org >> >> >> > >> >> >> > wrote: >> >> >> > >>> >> >> >> > >>>> "How is scaling down different from killing instances?" >> >> >> > >>>> >> >> >> > >>>> I found 'killTasks' syntax too different and way much more >> >> powerful >> >> >> to >> >> >> > >>>> be used for scaling in. The TaskQuery allows killing instances >> >> >> across >> >> >> > >>>> jobs/roles, whereas 'scaleIn' is narrowed down to just a >> single >> >> job. >> >> >> > >>>> Additional benefit: it can be ACLed independently by allowing >> >> >> external >> >> >> > >>>> process kill tasks only within a given job. We may also add >> rate >> >> >> > >>>> limiting or backoff to it later. >> >> >> > >>>> >> >> >> > >>>> As for Joshua's question, I feel it should be an operator's >> >> >> > >>>> responsibility to diff a job with its aurora config before >> >> applying >> >> >> an >> >> >> > >>>> update. That said, if there is enough demand we can definitely >> >> >> > >>>> consider adding something similar to what George suggested or >> >> >> > >>>> resurrecting a 'large change' warning message we used to have >> in >> >> >> > >>>> client updater. >> >> >> > >>>> >> >> >> > >>>> On Thu, Jan 14, 2016 at 12:06 PM, George Sirois < >> >> >> george@tellapart.com >> >> >> > > >> >> >> > >>>> wrote: >> >> >> > >>>> > As a point of reference, we solved this problem by adding a >> >> >> binding >> >> >> > >>>> helper >> >> >> > >>>> > that queries the scheduler for the current number of >> instances >> >> and >> >> >> > uses >> >> >> > >>>> > that number instead of a hardcoded config: >> >> >> > >>>> > >> >> >> > >>>> > instances='{{scaling_instances[60]}}' >> >> >> > >>>> > >> >> >> > >>>> > In this example, instances will be set to the currently >> running >> >> >> > number >> >> >> > >>>> > (unless there are none, in which case 60 instances will be >> >> >> created). >> >> >> > >>>> > >> >> >> > >>>> > On Thu, Jan 14, 2016 at 2:44 PM, Joshua Cohen < >> >> jcohen@apache.org> >> >> >> > wrote: >> >> >> > >>>> > >> >> >> > >>>> >> What happens if a job has been scaled out, but the >> underlying >> >> >> > config is >> >> >> > >>>> not >> >> >> > >>>> >> updated to take that scaling into account? Would the next >> >> update >> >> >> > on that >> >> >> > >>>> >> job revert the number of instances (presumably, because >> what >> >> else >> >> >> > could >> >> >> > >>>> we >> >> >> > >>>> >> do)? Is there anything we can do, tooling-wise, to improve >> >> upon >> >> >> > this? >> >> >> > >>>> >> >> >> >> > >>>> >> On Thu, Jan 14, 2016 at 1:40 PM, Maxim Khutornenko < >> >> >> > maxim@apache.org> >> >> >> > >>>> >> wrote: >> >> >> > >>>> >> >> >> >> > >>>> >> > Our rolling update APIs can be quite inconvenient to work >> >> with >> >> >> > when it >> >> >> > >>>> >> > comes to instance scaling [1]. It's especially >> frustrating >> >> when >> >> >> > >>>> >> > adding/removing instances has to be done in an automated >> >> >> fashion >> >> >> > >>>> (e.g.: >> >> >> > >>>> >> by >> >> >> > >>>> >> > an external autoscaling process) as it requires holding >> on >> >> to >> >> >> the >> >> >> > >>>> >> original >> >> >> > >>>> >> > aurora config at all times. >> >> >> > >>>> >> > >> >> >> > >>>> >> > I propose we add simple instance scaling APIs to address >> the >> >> >> > above. >> >> >> > >>>> Since >> >> >> > >>>> >> > Aurora job may have instances at different configs at any >> >> >> > moment, I >> >> >> > >>>> >> propose >> >> >> > >>>> >> > we accept an InstanceKey as a reference point when >> scaling >> >> out. >> >> >> > For >> >> >> > >>>> >> > example: >> >> >> > >>>> >> > >> >> >> > >>>> >> > /** Scales out a given job by adding more instances >> with >> >> >> the >> >> >> > task >> >> >> > >>>> >> > config of the templateKey. */ >> >> >> > >>>> >> > Response scaleOut(1: InstanceKey templateKey, 2: i32 >> >> >> > >>>> incrementCount) >> >> >> > >>>> >> > >> >> >> > >>>> >> > /** Scales in a given job by removing existing >> >> instances. >> >> >> */ >> >> >> > >>>> >> > Response scaleIn(1: JobKey job, 2: i32 >> decrementCount) >> >> >> > >>>> >> > >> >> >> > >>>> >> > A correspondent client command could then look like: >> >> >> > >>>> >> > >> >> >> > >>>> >> > aurora job scale-out devcluster/vagrant/test/hello/1 >> 10 >> >> >> > >>>> >> > >> >> >> > >>>> >> > For the above command, a scheduler would take task >> config of >> >> >> > instance >> >> >> > >>>> 1 >> >> >> > >>>> >> of >> >> >> > >>>> >> > the 'hello' job and replicate it 10 more times thus >> adding >> >> 10 >> >> >> > >>>> additional >> >> >> > >>>> >> > instances to the job. >> >> >> > >>>> >> > >> >> >> > >>>> >> > There are, of course, some details to work out like >> making >> >> sure >> >> >> > no >> >> >> > >>>> active >> >> >> > >>>> >> > update is in flight, scale out does not violate quota and >> >> etc. >> >> >> I >> >> >> > >>>> intend >> >> >> > >>>> >> to >> >> >> > >>>> >> > address those during the implementation as things >> progress. >> >> >> > >>>> >> > >> >> >> > >>>> >> > Does the above make sense? Any concerns/suggestions? >> >> >> > >>>> >> > >> >> >> > >>>> >> > Thanks, >> >> >> > >>>> >> > Maxim >> >> >> > >>>> >> > >> >> >> > >>>> >> > [1] - https://issues.apache.org/jira/browse/AURORA-1258 >> >> >> > >>>> >> > >> >> >> > >>>> >> >> >> >> > >>>> >> >> >> > >> >> >> >> >> >>