aurora-issues mailing list archives

From "Igor Morozov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AURORA-1721) Support user initiated rollback
Date Fri, 29 Jul 2016 23:02:20 GMT

    [ https://issues.apache.org/jira/browse/AURORA-1721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400186#comment-15400186 ]

Igor Morozov commented on AURORA-1721:
--------------------------------------

We can't create a new job update for the rollback, at least not easily, because our infrastructure
has been built around the concept of rollbackable workflows. Creating a new job update for the
rollback would mutate the state of the update workflow, which we want to avoid for a variety of
reasons.

Pausing an update in ROLL_FORWARD_AWAITING_PULSE (or any new state, for that matter) before
it moves to the ROLLED_FORWARD state is just a way to implement a two-phase commit for a
distributed, coordinated update.

This is what we want to achieve with this change:

Coordinator starts an upgrade:
     dc1: -> starting update1 for job1
     dc2: -> starting update2 for job2
----
Coordinator:
     dc1: update1 is done, enters paused state
     dc2: update2 has failed, rolling back
----
Coordinator:
     dc1: starts rolling back update1
     dc2: update2 is rolled back
----
Coordinator:
     dc1: update1 is rolled back
     dc2: update2 is rolled back

Without an intermediate state for the job update, we would need to create a new update to
roll back, as you suggested, which mutates the state of the distributed workflow from
(update1, update2) to (update3, update2).
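
The sequence above can be sketched as a small two-phase commit. This is only an illustration under
stated assumptions, not Aurora's scheduler code: the names Update, State, AWAITING_COMMIT and
coordinate are hypothetical, and AWAITING_COMMIT stands in for whatever paused state is chosen
(for example ROLL_FORWARD_AWAITING_PULSE). The point it tries to show is that the rollback acts
on the same update objects, so the workflow stays (update1, update2).

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    ROLLING_FORWARD = "ROLLING_FORWARD"
    AWAITING_COMMIT = "AWAITING_COMMIT"  # hypothetical paused state
    ROLLED_FORWARD = "ROLLED_FORWARD"
    ROLLED_BACK = "ROLLED_BACK"


@dataclass
class Update:
    """Hypothetical stand-in for one job update in one datacenter."""
    update_id: str
    dc: str
    state: State = State.ROLLING_FORWARD

    def finish_roll_forward(self, succeeded: bool) -> None:
        # Phase 1: a successful update pauses instead of becoming terminal;
        # a failed update rolls itself back.
        self.state = State.AWAITING_COMMIT if succeeded else State.ROLLED_BACK

    def commit(self) -> None:
        if self.state is not State.AWAITING_COMMIT:
            raise ValueError(f"{self.update_id} is not paused")
        self.state = State.ROLLED_FORWARD

    def rollback(self) -> None:
        # User-initiated rollback of the *same* update; no new update is created.
        if self.state is not State.AWAITING_COMMIT:
            raise ValueError(f"{self.update_id} is not paused")
        self.state = State.ROLLED_BACK


def coordinate(updates, outcomes):
    """Run both phases; outcomes maps update_id -> whether roll-forward succeeded."""
    for update in updates:
        update.finish_roll_forward(outcomes[update.update_id])

    # Phase 2: commit only if every datacenter reached the paused state.
    all_paused = all(u.state is State.AWAITING_COMMIT for u in updates)
    for update in updates:
        if update.state is State.AWAITING_COMMIT:
            if all_paused:
                update.commit()
            else:
                update.rollback()
    return [(u.update_id, u.state) for u in updates]


if __name__ == "__main__":
    updates = [Update("update1", "dc1"), Update("update2", "dc2")]
    # dc1 succeeds and pauses, dc2 fails: both end up rolled back.
    for update_id, state in coordinate(updates, {"update1": True, "update2": False}):
        print(update_id, state.name)
```

Because update1 is paused rather than terminal when dc2 fails, the coordinator can still roll it
back in place, and the update identifiers never change.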

If somebody wants to roll back hours after the upgrade is done, they would need to roll forward
to the previous version (a logical rollback).
The use case we're targeting is fast rollbacks for distributed updates.

> Support user initiated rollback 
> --------------------------------
>
>                 Key: AURORA-1721
>                 URL: https://issues.apache.org/jira/browse/AURORA-1721
>             Project: Aurora
>          Issue Type: Task
>          Components: Scheduler
>            Reporter: Igor Morozov
>            Assignee: Igor Morozov
>              Labels: Uber
>             Fix For: 0.16.0
>
>
> The proposal to support user initiated rollback:
> 1. Create a new thrift API:
>   /** Rollback job update. */
>   Response rollbackJobUpdate(
>       /** The update to roll back. */
>       1: JobUpdateKey key,
>       /** A user-specified message to include with the induced job update state change. */
>       3: string message)
> 2. Implement the new API in the scheduler so that it undoes the latest JobUpdate, effectively
> trying to apply initialState to the job. If that is impossible for some reason, the rollback
> will fail with an appropriate error message.
> 3. Support a new aurora client command, 'rollback'.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
