Date: Thu, 10 Mar 2016 20:32:40 +0000 (UTC)
From: "Stephan Ewen (JIRA)"
To: issues@flink.apache.org
Reply-To: dev@flink.apache.org
Subject: [jira] [Assigned] (FLINK-3594) StreamTask may fail when checkpoint is concurrent to regular termination


     [ https://issues.apache.org/jira/browse/FLINK-3594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Ewen reassigned FLINK-3594:
-----------------------------------

    Assignee: Stephan Ewen

> StreamTask may fail when checkpoint is concurrent to regular termination
> ------------------------------------------------------------------------
>
>                 Key: FLINK-3594
>                 URL: https://issues.apache.org/jira/browse/FLINK-3594
>             Project: Flink
>          Issue Type: Bug
>            Reporter: Chesnay Schepler
>            Assignee: Stephan Ewen
>              Labels: test-stability
>
> Some tests in the KafkaConsumerTestBase rely on throwing a SuccessException to stop the streaming job once the test condition is fulfilled.
> The job then fails, and it is checked whether the cause was a SuccessException. If so, the test is marked as a success, otherwise as a failure.
> However, should this exception be thrown while a checkpoint is being triggered, the exception that stops the job is not the SuccessException, but a CancelTaskException.
> This should affect every test that uses the SuccessException.
> Observed here: https://travis-ci.org/apache/flink/jobs/114523189
> The problem is that the exception causes the StreamTask to enter the finally block inside invoke(), which sets isRunning to false. Within triggerCheckpoint(), isRunning is then checked for being false, and if so a CancelTaskException is thrown.
> This seems like an issue of the runtime; I observed other tests failing without giving a good cause, since the CancelTaskException masks it.
> I was wondering whether triggerCheckpoint() could return false instead of throwing an exception, and simply assume that an exception will be thrown within invoke().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
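
For illustration, the ordering described in the report can be replayed in a small, single-threaded Java model. This is only a sketch under stated assumptions, not Flink's StreamTask code: SketchTask, the afterShutdown hook, replay(), and both triggerCheckpoint variants are hypothetical names, and the two exception classes are local stand-ins. The hook simulates the checkpoint thread running after the finally block has cleared isRunning but before the SuccessException reaches the caller. The sketch also assumes that the first reported failure becomes the job's failure cause.

```java
// Simplified, single-threaded model of the ordering in FLINK-3594.
// All names are hypothetical stand-ins; this is NOT Flink's StreamTask.
class CheckpointRaceSketch {

    /** Local stand-in for the SuccessException thrown by the tests. */
    static class SuccessException extends RuntimeException {}

    /** Local stand-in for the runtime's CancelTaskException. */
    static class CancelTaskException extends RuntimeException {}

    /** Heavily reduced task that only models the isRunning flag. */
    static class SketchTask {
        private volatile boolean isRunning;

        /**
         * Runs the user code. The finally block clears isRunning even when the
         * user code throws; afterShutdown simulates another thread acting in
         * the window before the exception reaches the caller.
         */
        void invoke(Runnable userCode, Runnable afterShutdown) {
            isRunning = true;
            try {
                userCode.run();
            } finally {
                isRunning = false;
                afterShutdown.run();
            }
        }

        /** Behavior described in the report: a late trigger throws. */
        void triggerCheckpointThrowing() {
            if (!isRunning) {
                throw new CancelTaskException();
            }
            // A real task would snapshot its state here.
        }

        /** Suggested alternative: report "not performed" instead of throwing. */
        boolean triggerCheckpointLenient() {
            if (!isRunning) {
                return false;
            }
            // A real task would snapshot its state here.
            return true;
        }
    }

    /**
     * Replays "task terminates, checkpoint trigger arrives right after" and
     * returns the first failure reported (assumed to be the job's cause).
     */
    static Throwable replay(boolean lenient) {
        SketchTask task = new SketchTask();
        Throwable[] firstFailure = new Throwable[1];

        // Simulated checkpoint thread; records its failure if it is the first.
        Runnable checkpointTrigger = () -> {
            try {
                if (lenient) {
                    task.triggerCheckpointLenient();
                } else {
                    task.triggerCheckpointThrowing();
                }
            } catch (CancelTaskException e) {
                if (firstFailure[0] == null) {
                    firstFailure[0] = e;
                }
            }
        };

        try {
            task.invoke(() -> { throw new SuccessException(); }, checkpointTrigger);
        } catch (SuccessException e) {
            // The original cause arrives after the checkpoint thread has acted.
            if (firstFailure[0] == null) {
                firstFailure[0] = e;
            }
        }
        return firstFailure[0];
    }

    public static void main(String[] args) {
        System.out.println("throwing trigger: " + replay(false).getClass().getSimpleName());
        System.out.println("lenient trigger:  " + replay(true).getClass().getSimpleName());
    }
}
```

In this model the throwing variant reports the CancelTaskException first, so a test checking for SuccessException would see the wrong cause. The lenient variant leaves the SuccessException as the only reported failure, which is the effect the last paragraph of the report asks about.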