aurora-dev mailing list archives

From "Kevin Sweeney (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AURORA-45) Scheduler should wait for registered to be called before attempting to invoke driver
Date Thu, 16 Jan 2014 22:55:22 GMT

    [ https://issues.apache.org/jira/browse/AURORA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874104#comment-13874104 ]

Kevin Sweeney commented on AURORA-45:
-------------------------------------

Did the TaskStateMachine refactor fix this or is it still being observed?

> Scheduler should wait for registered to be called before attempting to invoke driver
> ------------------------------------------------------------------------------------
>
>                 Key: AURORA-45
>                 URL: https://issues.apache.org/jira/browse/AURORA-45
>             Project: Aurora
>          Issue Type: Bug
>          Components: Scheduler
>            Reporter: Bill Farner
>            Assignee: Bill Farner
>
> We have observed the scheduler attempting to kill tasks before {{registered()}} had been
> called. The driver dropped those attempts on the floor. Because the driver did not signal
> failure to the scheduler (it only logged an error), the scheduler wrote a KILLING state
> transition to the replicated log and signaled success to the client. Because the
> {{killTasks}} message was never sent, the task timed out and continued to run until the
> GC executor reconciled state.
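
One way to picture the fix the issue title asks for is a small gate in front of the driver. The following is only a sketch under assumptions, not the Aurora or Mesos code: TaskKiller, GatedDriver, markRegistered and the timeout parameter are hypothetical names invented for illustration. Kill requests wait until registration has been observed and fail loudly if it never arrives, so the caller can avoid recording a KILLING transition for a message that was never sent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the one driver call the scheduler makes here. */
interface TaskKiller {
  void killTask(String taskId);
}

/**
 * Hypothetical wrapper: holds kill requests until registration has been
 * observed, and throws instead of silently dropping them.
 */
final class GatedDriver implements TaskKiller {
  private final TaskKiller delegate;
  private final long timeoutMillis;
  private final CountDownLatch registered = new CountDownLatch(1);

  GatedDriver(TaskKiller delegate, long timeoutMillis) {
    this.delegate = delegate;
    this.timeoutMillis = timeoutMillis;
  }

  /** Intended to be called from the framework's registered() callback. */
  void markRegistered() {
    registered.countDown();
  }

  @Override
  public void killTask(String taskId) {
    try {
      // Block until registered() has fired, up to the configured timeout.
      if (!registered.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
        throw new IllegalStateException(
            "Driver not registered; refusing to kill " + taskId);
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(
          "Interrupted while waiting for registration", e);
    }
    // Only reached once registration is known to have happened.
    delegate.killTask(taskId);
  }
}
```

With a wrapper like this, a caller would write the KILLING transition and report success only after killTask returns normally. An exception means the request never reached the driver.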



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
