flink-issues mailing list archives

From "Fabian Hueske (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1581) Configure DeathWatch parameters properly
Date Mon, 11 Jan 2016 23:23:39 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15092932#comment-15092932 ]

Fabian Hueske commented on FLINK-1581:

[~till.rohrmann], is this issue still valid?

> Configure DeathWatch parameters properly
> ----------------------------------------
>                 Key: FLINK-1581
>                 URL: https://issues.apache.org/jira/browse/FLINK-1581
>             Project: Flink
>          Issue Type: Bug
>            Reporter: Till Rohrmann
> We are using Akka's DeathWatch mechanism to detect failed components. However, the interval
> until an {{Instance}} is marked dead is currently very long. Especially in conjunction with
> the job restarting mechanism, we should either devise a mechanism that quickly detects dead
> {{Instance}}s or set the interval, pause, and threshold values such that detection does
> not take longer than the Akka ask timeout value. Otherwise, all retries might be consumed
> before an {{Instance}} is recognized as dead.
> Further investigation of the correct failure behavior is necessary.
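
The constraint described in the issue (failure detection should not take longer than the Akka ask timeout) can be illustrated with a rough sanity check. This is only a sketch under stated assumptions: the names `WatchSettings`, `estimated_detection_s`, and `detects_within_ask_timeout` are hypothetical and not part of Flink or Akka, the numbers are made up for illustration, and the estimate (one heartbeat interval plus the acceptable pause) is a deliberate simplification that ignores the effect of the threshold value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WatchSettings:
    """Hypothetical container for the three values named in the issue."""
    heartbeat_interval_s: float  # time between heartbeats, in seconds
    acceptable_pause_s: float    # tolerated silence before suspicion, in seconds
    threshold: float             # detector threshold; NOT modeled in the estimate


def estimated_detection_s(settings: WatchSettings) -> float:
    # Simplifying assumption: a dead peer is noticed roughly one heartbeat
    # interval plus the acceptable pause after its last heartbeat.
    return settings.heartbeat_interval_s + settings.acceptable_pause_s


def detects_within_ask_timeout(settings: WatchSettings, ask_timeout_s: float) -> bool:
    # True if the rough detection estimate fits inside the ask timeout.
    return estimated_detection_s(settings) <= ask_timeout_s


if __name__ == "__main__":
    slow = WatchSettings(heartbeat_interval_s=5.0, acceptable_pause_s=20.0, threshold=12.0)
    fast = WatchSettings(heartbeat_interval_s=1.0, acceptable_pause_s=6.0, threshold=12.0)
    print(detects_within_ask_timeout(slow, ask_timeout_s=10.0))  # False (25 s > 10 s)
    print(detects_within_ask_timeout(fast, ask_timeout_s=10.0))  # True (7 s <= 10 s)
```

With the illustrative "slow" values, a retry that waits only for the ask timeout could run several times before the dead {{Instance}} is noticed, which is the situation the issue warns about.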

This message was sent by Atlassian JIRA
