aurora-issues mailing list archives

From "Bill Farner (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (AURORA-1240) Ignore JobUpdateSettings.maxWaitToInstanceRunningMs in the scheduler
Date Tue, 07 Apr 2015 00:31:12 GMT

     [ https://issues.apache.org/jira/browse/AURORA-1240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Farner updated AURORA-1240:
--------------------------------
    Component/s:     (was: Client)
    Description: 
The UpdateConfig {{restart_threshold}} \[1\] setting does not appear to deliver much user value:
it is highly sensitive to scheduling performance and, when set too low, may result in aborted
or rolled-back job updates.

Some background: This timeout controls the task transition from {{PENDING}} to {{RUNNING}} during
a job update. During a cluster capacity shortage, assigning a task to a host may take
considerably longer, so the timeout expires and, depending on the failure settings, causes
an unnecessary job update abort or rollback. The setting was meant to give users some protection
against unsatisfiable resource/constraint requirements. In practice, though, it has proved to be
more of an annoyance to users when an update is interrupted by an unexpected delay in task assignment.
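
The failure mode described above can be illustrated with a minimal sketch. This is not Aurora's
scheduler code: the function names, the {{INTERRUPTED}} label, and the failure-tolerance parameter
are hypothetical, and the sketch assumes an update is interrupted once the number of timed-out
instances exceeds the tolerated count. Passing {{None}} as the timeout models a scheduler that
ignores the setting.

```python
from typing import List, Optional


def timed_out(pending_ms: int, max_wait_ms: Optional[int]) -> bool:
    """True if a task sat in PENDING longer than the timeout allows.

    max_wait_ms=None models a scheduler that ignores the setting (hypothetical).
    """
    return max_wait_ms is not None and pending_ms > max_wait_ms


def update_outcome(pending_ms_per_instance: List[int],
                   max_wait_ms: Optional[int],
                   max_failed_instances: int) -> str:
    """Classify an update by how many instances exceeded the PENDING timeout.

    Assumption: the update is interrupted (aborted or rolled back) when the
    number of timed-out instances exceeds max_failed_instances.
    """
    failures = sum(1 for p in pending_ms_per_instance if timed_out(p, max_wait_ms))
    return "INTERRUPTED" if failures > max_failed_instances else "SUCCEEDED"


# One instance waits 90 s for a host because the cluster is short on capacity.
pending = [5_000, 90_000, 12_000]
print(update_outcome(pending, 60_000, 0))  # INTERRUPTED
print(update_outcome(pending, None, 0))    # SUCCEEDED
```

With a 60-second timeout, the single slow assignment interrupts an otherwise healthy update; with
the timeout ignored, the same update completes.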

Consider deprecating and subsequently removing this setting.

This ticket tracks the first step: ignoring this value in the scheduler updater. See linked
tickets for follow-up work.


\[1\] - https://github.com/apache/aurora/blob/master/docs/configuration-reference.md#updateconfig-objects

  was:
The UpdateConfig {{restart_threshold}} \[1\] setting does not appear to deliver much user value:
it is highly sensitive to scheduling performance and, when set too low, may result in aborted
or rolled-back job updates.

Some background: This timeout controls the task transition from {{PENDING}} to {{RUNNING}} during
a job update. During a cluster capacity shortage, assigning a task to a host may take
considerably longer, so the timeout expires and, depending on the failure settings, causes
an unnecessary job update abort or rollback. The setting was meant to give users some protection
against unsatisfiable resource/constraint requirements. In practice, though, it has proved to be
more of an annoyance to users when an update is interrupted by an unexpected delay in task assignment.

Consider deprecating and subsequently removing this setting.

\[1\] - https://github.com/apache/aurora/blob/master/docs/configuration-reference.md#updateconfig-objects

        Summary: Ignore JobUpdateSettings.maxWaitToInstanceRunningMs in the scheduler  (was:
Deprecate UpdateConfig "restart_threshold" setting)

> Ignore JobUpdateSettings.maxWaitToInstanceRunningMs in the scheduler
> --------------------------------------------------------------------
>
>                 Key: AURORA-1240
>                 URL: https://issues.apache.org/jira/browse/AURORA-1240
>             Project: Aurora
>          Issue Type: Task
>          Components: Scheduler
>            Reporter: Maxim Khutornenko
>            Assignee: Bill Farner
>
> The UpdateConfig {{restart_threshold}} \[1\] setting does not appear to deliver much user
> value: it is highly sensitive to scheduling performance and, when set too low, may result in
> aborted or rolled-back job updates.
> Some background: This timeout controls the task transition from {{PENDING}} to {{RUNNING}}
> during a job update. During a cluster capacity shortage, assigning a task to a host may take
> considerably longer, so the timeout expires and, depending on the failure settings, causes
> an unnecessary job update abort or rollback. The setting was meant to give users some protection
> against unsatisfiable resource/constraint requirements. In practice, though, it has proved to be
> more of an annoyance to users when an update is interrupted by an unexpected delay in task
> assignment.
> Consider deprecating and subsequently removing this setting.
> This ticket tracks the first step: ignoring this value in the scheduler updater. See linked
> tickets for follow-up work.
> \[1\] - https://github.com/apache/aurora/blob/master/docs/configuration-reference.md#updateconfig-objects



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
