Date: Thu, 16 Jan 2014 22:57:20 +0000 (UTC)
From: "Bill Farner (JIRA)"
To: dev@aurora.incubator.apache.org
Reply-To: dev@aurora.incubator.apache.org
Subject: [jira] [Commented] (AURORA-45) Scheduler should wait for registered to be called before attempting to invoke driver

[ 
https://issues.apache.org/jira/browse/AURORA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874106#comment-13874106 ]

Bill Farner commented on AURORA-45:
-----------------------------------

Nope, I explicitly left this behavior intact during that refactor to minimally impact tests. The fix is trivial after that refactor, though.

> Scheduler should wait for registered to be called before attempting to invoke driver
> ------------------------------------------------------------------------------------
>
>                 Key: AURORA-45
>                 URL: https://issues.apache.org/jira/browse/AURORA-45
>             Project: Aurora
>          Issue Type: Bug
>          Components: Scheduler
>            Reporter: Bill Farner
>            Assignee: Bill Farner
>
> We have observed the scheduler attempting to kill tasks before {{registered()}} had been called. This resulted in the driver dropping those attempts on the floor. Since the driver didn't signal failure to the scheduler (it only logged an error), the scheduler wrote a KILLING state transition to the replicated log and signaled success to the client. Since the {{killTasks}} message was never sent, the task timed out and continued to run until the GC executor reconciled state.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
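
The behavior the issue title asks for (hold driver calls until registered() has fired) could be sketched roughly as below. This is only an illustration in Python, not the Aurora scheduler's actual code or the Mesos driver API: RegistrationGatedDriver, RecordingDriver, on_registered, kill_task and timeout_secs are all hypothetical names made up for the example. The idea is that a kill request either reaches the driver after registration or fails loudly, instead of being dropped while the caller is told it succeeded.

```python
import threading


class RecordingDriver:
    """Hypothetical stand-in for a real driver; it only records kill requests."""

    def __init__(self):
        self.killed = []

    def kill_task(self, task_id):
        self.killed.append(task_id)


class RegistrationGatedDriver:
    """Hypothetical wrapper that holds driver calls until registration is seen."""

    def __init__(self, driver, timeout_secs=None):
        self._driver = driver
        self._registered = threading.Event()
        # None means "wait indefinitely" for registration.
        self._timeout_secs = timeout_secs

    def on_registered(self):
        # To be called from the scheduler's registered() callback.
        self._registered.set()

    def kill_task(self, task_id):
        # Event.wait returns False if the timeout elapsed before set() was called.
        if not self._registered.wait(self._timeout_secs):
            # Surface the failure to the caller rather than dropping the request.
            raise TimeoutError("driver not registered; kill of %s not sent" % task_id)
        self._driver.kill_task(task_id)


if __name__ == "__main__":
    driver = RecordingDriver()
    gated = RegistrationGatedDriver(driver, timeout_secs=0.05)
    try:
        gated.kill_task("task-1")  # registration has not happened yet
    except TimeoutError as e:
        print("kill rejected:", e)
    gated.on_registered()
    gated.kill_task("task-1")  # now forwarded to the driver
    print("killed:", driver.killed)
```

With a gate like this, the caller would only record a KILLING transition after kill_task returns without raising. Whether to block, queue, or reject calls made before registration is a design choice the issue itself does not settle.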