Reply-To: dev@flink.apache.org
Date: Fri, 26 Feb 2016 16:22:18 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)" <jira@apache.org>
To: issues@flink.apache.org
Message-ID: <JIRA.12940218.1455818488000.156960.1456503738132@Atlassian.JIRA>
In-Reply-To: <JIRA.12940218.1455818488000@Atlassian.JIRA>
References: <JIRA.12940218.1455818488000@Atlassian.JIRA>
 <JIRA.12940218.1455818488381@arcas>
Subject: [jira] [Commented] (FLINK-3443) JobManager cancel and clear
 everything fails jobs instead of cancelling


    [ https://issues.apache.org/jira/browse/FLINK-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15169252#comment-15169252 ] 

ASF GitHub Bot commented on FLINK-3443:
---------------------------------------

GitHub user uce opened a pull request:

    https://github.com/apache/flink/pull/1726

    [FLINK-3443] [runtime] Prevent cancelled jobs from restarting

    This is one part of #1669 on which I think we have consensus. It would be good to have it in the next RC. @rmetzger ran into this issue: it showed up as a job that would not cancel, because a failure was overwriting the cancellation.
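
To illustrate the behaviour described above (a failure must not overwrite a cancellation), here is a minimal sketch of a job status state machine. It is not Flink's actual code: the class JobStateSketch, its Status enum, and the method names are hypothetical and only loosely modelled on the job states mentioned in this thread. The assumption is that status changes go through compare-and-set, so fail() only takes effect while the job is still RUNNING.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical, simplified job status state machine (not Flink's ExecutionGraph). */
public class JobStateSketch {

    enum Status { RUNNING, FAILING, RESTARTING, CANCELLING, CANCELED }

    private final AtomicReference<Status> status =
            new AtomicReference<>(Status.RUNNING);

    /** Request cancellation; only a running job moves to CANCELLING. */
    public void cancel() {
        status.compareAndSet(Status.RUNNING, Status.CANCELLING);
    }

    /** Report a failure; ignored if a cancellation is already in progress. */
    public void fail() {
        // CAS from RUNNING only, so CANCELLING/CANCELED are never overwritten.
        status.compareAndSet(Status.RUNNING, Status.FAILING);
    }

    /** Called once all tasks have stopped; decides the follow-up state. */
    public void allTasksStopped() {
        if (status.compareAndSet(Status.CANCELLING, Status.CANCELED)) {
            return; // a cancelled job must not restart
        }
        // Only a job that was failing (and not cancelled) may restart.
        status.compareAndSet(Status.FAILING, Status.RESTARTING);
    }

    public Status status() {
        return status.get();
    }

    public static void main(String[] args) {
        JobStateSketch job = new JobStateSketch();
        job.cancel();          // user cancels the job
        job.fail();            // a task fails while being torn down
        job.allTasksStopped();
        System.out.println(job.status()); // prints CANCELED
    }
}
```

If fail() instead set the status unconditionally, the same sequence would end in RESTARTING, which matches the symptom reported here.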


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/uce/flink cancel-overwrites-fail

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1726.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1726
    
----
commit 1c11436f85697333b900830d4cb7e74cc1059f48
Author: Ufuk Celebi <uce@apache.org>
Date:   2016-02-26T16:12:05Z

    [FLINK-3443] [runtime] Prevent cancelled jobs from restarting

----


> JobManager cancel and clear everything fails jobs instead of cancelling
> -----------------------------------------------------------------------
>
>                 Key: FLINK-3443
>                 URL: https://issues.apache.org/jira/browse/FLINK-3443
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Runtime
>            Reporter: Ufuk Celebi
>            Assignee: Ufuk Celebi
>
> When the job manager is shut down, it calls {{cancelAndClearEverything}}. This method does not {{cancel}} the {{ExecutionGraph}} instances, but {{fail}}s them, which can lead to an {{ExecutionGraph}} restart.
> I've noticed this in tests, where an old graph got into a loop of restarts.
> What I don't understand is why the futures etc. are not cancelled when the executor service is shut down.
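
On the last question, one possible contributing factor can be shown with a small sketch. It assumes a plain java.util.concurrent.ExecutorService, which may not be the executor the JobManager actually uses; the class ShutdownSketch and the method demo() are made up for illustration. With that assumption, shutdownNow() interrupts running tasks and hands back the queued ones, but it does not cancel their futures; the caller has to do that explicitly.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical demo: shutting down an executor does not cancel queued futures. */
public class ShutdownSketch {

    /**
     * Returns {cancelledAfterShutdownNow, doneAfterShutdownNow,
     * cancelledAfterExplicitCancel} for a task that was still queued.
     */
    static boolean[] demo() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // First task occupies the only worker thread until interrupted.
            pool.submit(() -> {
                started.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            started.await();

            // Second task cannot start, so it waits in the queue.
            Future<?> queued = pool.submit(() -> { });

            // Interrupts the running task and returns tasks that never ran.
            List<Runnable> neverRan = pool.shutdownNow();
            boolean cancelled = queued.isCancelled();
            boolean done = queued.isDone();

            // The futures are only cancelled if the caller does it explicitly.
            for (Runnable r : neverRan) {
                if (r instanceof Future) {
                    ((Future<?>) r).cancel(false);
                }
            }
            return new boolean[] { cancelled, done, queued.isCancelled() };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // prints [false, false, true]
    }
}
```

Under the same assumption, anything waiting on such a future would block indefinitely unless it is cancelled by hand after the shutdown.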

