flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-3184) Decrease Akka timeouts on cluster side to make system more responsive
Date Fri, 18 Dec 2015 11:25:46 GMT

    [ https://issues.apache.org/jira/browse/FLINK-3184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15063840#comment-15063840 ]

ASF GitHub Bot commented on FLINK-3184:

Github user tillrohrmann commented on the pull request:

    This is true. It may be a bit premature, but I plan to remove them completely with my
next PR. I want to introduce a `RestartStrategy` which can be set on a per-job basis and
encapsulates the restart logic.
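
As a rough illustration of the idea, and not the actual Flink design, a per-job restart strategy could bundle the decision of whether to restart with the delay to wait before doing so. The sketch is in Python for brevity, and every name in it (`FixedDelayRestartStrategy`, `can_restart`, `next_delay`) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FixedDelayRestartStrategy:
    """Hypothetical per-job policy: at most max_attempts restarts, delay_s apart."""

    max_attempts: int
    delay_s: float
    attempts: int = 0

    def can_restart(self) -> bool:
        # Restarts are allowed until the attempt budget is used up.
        return self.attempts < self.max_attempts

    def next_delay(self) -> float:
        """Record one restart attempt and return the delay to wait before it."""
        if not self.can_restart():
            raise RuntimeError("restart attempts exhausted")
        self.attempts += 1
        return self.delay_s


# Each job carries its own strategy instead of reusing the default future timeout.
streaming_job_strategy = FixedDelayRestartStrategy(max_attempts=3, delay_s=1.0)
```

With something along these lines, the wait between restart attempts would no longer depend on the Akka timeout settings.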

> Decrease Akka timeouts on cluster side to make system more responsive
> ---------------------------------------------------------------------
>                 Key: FLINK-3184
>                 URL: https://issues.apache.org/jira/browse/FLINK-3184
>             Project: Flink
>          Issue Type: Improvement
>    Affects Versions: 1.0.0
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>            Priority: Minor
> Currently, the default timeout for futures is set to 100 s. This is also the time used to
wait in between restart attempts if no other value has been explicitly specified. Especially
in the streaming case, it is often necessary to detect and react to failures in a
shorter period than 100 s. Therefore, I propose to decrease the default timeout to 10 s.
> Additionally, I propose to introduce a slightly higher timeout for the client side (e.g.
60 s). The reason is that in case of a {{JobManager}} failure the client has to wait until the
cluster has recovered. Recovery via ZooKeeper can take longer than 10 s. In such a case
a recovery could be falsely recognized as a lost connection.
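
The reasoning behind the two timeouts can be checked with a small sketch. The 10 s and 60 s values come from the issue text. The 30 s recovery duration and the function name `looks_like_lost_connection` are assumptions made purely for illustration.

```python
CLUSTER_TIMEOUT_S = 10.0  # proposed cluster-side default (currently 100 s)
CLIENT_TIMEOUT_S = 60.0   # proposed client-side value ("e.g. 60 s" in the issue)


def looks_like_lost_connection(recovery_s: float, timeout_s: float) -> bool:
    """True if a recovery lasting recovery_s would outlast the timeout."""
    return recovery_s > timeout_s


# Assumed duration of a ZooKeeper-based JobManager recovery (hypothetical).
recovery_s = 30.0

# With the cluster-side timeout the client would misread the recovery as a
# lost connection; with the longer client-side timeout it would keep waiting.
misread_with_cluster_timeout = looks_like_lost_connection(recovery_s, CLUSTER_TIMEOUT_S)
misread_with_client_timeout = looks_like_lost_connection(recovery_s, CLIENT_TIMEOUT_S)
```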
