aurora-dev mailing list archives

From: Maxim Khutornenko <>
Subject: Re: Handling of aurora update when job/task cannot be scheduled
Date: Thu, 15 May 2014 16:35:16 GMT
Thanks for the proposal, Anindya. I do have some concerns about it though:

- The task in the PENDING state might never get scheduled, for a variety of
reasons: unsatisfied constraints, unreasonable resource requirements, and so on.
Furthermore, even if the task eventually gets scheduled, it may never reach
RUNNING, or more likely it will fail repeatedly and get THROTTLED for
flapping. Neither outcome can be considered a criterion for a successful update.

- Even assuming PENDING tasks are blocked only by a lack of resources and get
unblocked eventually, having hundreds or thousands of PENDING tasks
transition to RUNNING at the same time may, and most likely will, cause
unpredictable performance problems on the package retrieval side (e.g. HDFS) or
the application side (e.g. backend connections/load); see the UpdateConfig
sketch after this list for the pacing knobs involved.

- A mixed update (one that updates existing instances and adds new ones) may
leave the service in a degraded state where some or all instances are killed
with no replacement coming online (e.g. when the new update config includes a
resource bump that cannot be satisfied). In the worst case, this may result
in a complete service outage.
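
To make the pacing point concrete, here is a minimal sketch of the UpdateConfig
fields involved, as they would appear in an .aurora config (Pystachio DSL). The
numeric values are illustrative assumptions, not recommendations, and the
comments describe the rough role of each field rather than precise semantics:

    # Sketch only: values below are illustrative assumptions.
    update_config = UpdateConfig(
      batch_size = 5,             # touch at most 5 instances at a time
      restart_threshold = 60,     # seconds to wait for an instance to reach RUNNING
      watch_secs = 45,            # an instance must stay RUNNING this long to be
                                  # considered healthy before the updater moves on
      max_per_shard_failures = 0,
      max_total_failures = 0,
      rollback_on_failure = True  # roll the whole update back on failure (default)
    )

With settings along these lines, a large instance bump is rolled out in paced
batches rather than all at once, which is what bounds the simultaneous load on
package retrieval and application backends described above.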


On Wed, May 14, 2014 at 11:01 PM, Anindya Sinha <> wrote:

> Hi
> I want to propose a modification to the handling of aurora update when the job
> or a task cannot be scheduled immediately, based on my understanding of job
> scheduling within aurora.
> Please feel free to share your comments and/or concerns.
> Thanks
> Anindya
> *Scenario*
> Assume we have a job with 2 RUNNING instances (say instances 0 and 1) in the
> cluster, and "aurora update" is then issued on the same job key, bumping the
> instance count up to, say, 5. By default, the updater keeps instances 0 and 1
> intact and attempts to launch 3 additional instances, waiting
> UpdateConfig.watch_secs for each instance to be in the RUNNING state before
> moving on to the next.
> Assume the cluster is in a state where only 1 additional instance can be
> launched, due to resource unavailability. Hence, instance 2 is launched (it is
> in the RUNNING state) while instance 3 moves to the PENDING state, and when
> UpdateConfig.restart_threshold expires, the updater deems this instance to be
> a failed instance.
> If UpdateConfig.rollback_on_failure is True (the default), it rolls back the
> changes made by the update and terminates instances 2 and 3.
> If UpdateConfig.rollback_on_failure is False, it does nothing further, keeping
> instances 0 through 2 RUNNING and instance 3 PENDING. Instance 4 is never
> attempted in either scenario.
> *Proposal*
> I propose that in aurora update we should NOT consider an instance that is
> still in the PENDING state after the UpdateConfig.restart_threshold timeout to
> be a failed case (and instead keep it in the PENDING state). The reasoning is
> that instances which could not be scheduled at the time of the aurora update
> can still be scheduled in the future, once a host in the cluster becomes
> available to run them (based on future resource availability).
> In the current approach, instance 4 is never even attempted, since instance 3
> is considered a failure. Further, the scheduling of jobs within aurora update
> should ideally be treated the same as aurora create (in the case of an aurora
> create with instance count = 5, we would have 3 RUNNING instances and 2
> PENDING instances, assuming the cluster is in a similar state).
> UpdateConfig.rollback_on_failure=False does not address the above use case in
> all scenarios since:
> a) It works if the PENDING instance is the last instance to be launched, but
> fails if there are additional instances still to be launched (as in the
> example above).
> b) It disables rollback, which may not be desirable for "real" failures to
> launch tasks in the cluster.
> Here is a JIRA that references this issue (it contains the same details as
> this email):
> Reference:

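For readers less familiar with the configuration fields discussed in this
thread, here is a minimal, hypothetical sketch of the kind of .aurora job
config the scenario describes. The cluster/role/environment/job names, the
task definition, and all numeric values are illustrative assumptions, not
taken from an actual config or from this thread:

    # Hypothetical .aurora config (Pystachio DSL) roughly matching the scenario:
    # a 'hello' service whose instance count is bumped to 5 by "aurora update".
    hello = Process(
      name = 'hello',
      cmdline = 'echo hello world; sleep 60')

    hello_task = SequentialTask(
      processes = [hello],
      resources = Resources(cpu = 1.0, ram = 128*MB, disk = 128*MB))

    update_config = UpdateConfig(
      watch_secs = 45,            # how long an instance must stay RUNNING to be
                                  # considered healthy by the updater
      restart_threshold = 60,     # how long the updater waits for RUNNING before
                                  # calling the instance a failure
      rollback_on_failure = True) # the default; False gives the NOP behavior
                                  # described in the scenario

    jobs = [Service(
      cluster = 'devcluster',
      role = 'www-data',
      environment = 'prod',
      name = 'hello',
      task = hello_task,
      instances = 5,              # bumped from 2 to 5 by the update in the scenario
      update_config = update_config)]

Rolling out the instance bump would then look something like
"aurora update devcluster/www-data/prod/hello hello.aurora", run against a job
originally created with "aurora create devcluster/www-data/prod/hello
hello.aurora" and instances = 2 (command forms as used around the time of this
thread; treat them as approximate).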