flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-3187) Decouple restart strategy from ExecutionGraph
Date Thu, 07 Jan 2016 14:12:39 GMT

    [ https://issues.apache.org/jira/browse/FLINK-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15087394#comment-15087394 ]

ASF GitHub Bot commented on FLINK-3187:

Github user tillrohrmann commented on the pull request:

    Thanks for the review @rmetzger.
    I think this is not a problem, because users cannot define their own restart strategies. To
set a `RestartStrategy`, the user has to provide a `RestartStrategyConfiguration`. The
`RestartStrategyConfiguration` cannot be extended outside the `RestartStrategies` class, so
users cannot define their own `RestartStrategyConfiguration`. Additionally, the strategy
itself is only instantiated from this configuration on the `JobManager` via the `RestartStrategyFactory`,
which is also code that the user cannot change via the API.
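
A minimal sketch of how such a closed configuration hierarchy could look, assuming the base class hides its constructor so that only classes nested inside `RestartStrategies` can extend it. The type names follow the comment above; the method names, fields, and bodies are illustrative assumptions, not the code from the pull request.

```java
// Hypothetical sketch: names follow the comment, bodies are assumptions.
final class RestartStrategies {
    private RestartStrategies() {}

    // The private constructor means only classes nested in RestartStrategies can subclass this.
    abstract static class RestartStrategyConfiguration {
        private RestartStrategyConfiguration() {}
    }

    static final class NoRestartStrategyConfiguration extends RestartStrategyConfiguration {
        private NoRestartStrategyConfiguration() {}
    }

    static final class FixedDelayRestartStrategyConfiguration extends RestartStrategyConfiguration {
        private final int restartAttempts;
        private final long delayMillis;

        private FixedDelayRestartStrategyConfiguration(int restartAttempts, long delayMillis) {
            this.restartAttempts = restartAttempts;
            this.delayMillis = delayMillis;
        }

        int getRestartAttempts() { return restartAttempts; }
        long getDelayMillis() { return delayMillis; }
    }

    // The only way for user code to obtain a configuration is through these factory methods.
    static RestartStrategyConfiguration noRestart() {
        return new NoRestartStrategyConfiguration();
    }

    static RestartStrategyConfiguration fixedDelayRestart(int restartAttempts, long delayMillis) {
        return new FixedDelayRestartStrategyConfiguration(restartAttempts, delayMillis);
    }
}

// Deliberately tiny stand-in for the strategy; the real interface would carry more behaviour.
interface RestartStrategy {
    boolean canRestart();
}

// Stand-in for the JobManager-side factory that turns a configuration into a strategy.
final class RestartStrategyFactory {
    private RestartStrategyFactory() {}

    static RestartStrategy createRestartStrategy(RestartStrategies.RestartStrategyConfiguration config) {
        if (config instanceof RestartStrategies.FixedDelayRestartStrategyConfiguration) {
            RestartStrategies.FixedDelayRestartStrategyConfiguration fixed =
                    (RestartStrategies.FixedDelayRestartStrategyConfiguration) config;
            final int attempts = fixed.getRestartAttempts();
            return () -> attempts > 0;
        }
        // Any other configuration (here: no restart) never allows a restart.
        return () -> false;
    }
}

final class RestartStrategyConfigDemo {
    public static void main(String[] args) {
        RestartStrategy strategy =
                RestartStrategyFactory.createRestartStrategy(RestartStrategies.fixedDelayRestart(3, 1000L));
        System.out.println("fixed delay can restart: " + strategy.canRestart());
    }
}
```

In this sketch, an attempt to write `new RestartStrategies.RestartStrategyConfiguration() {}` from another top-level class does not compile, because the constructor is private. User code can therefore only pick from the predefined configurations.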

> Decouple restart strategy from ExecutionGraph
> ---------------------------------------------
>                 Key: FLINK-3187
>                 URL: https://issues.apache.org/jira/browse/FLINK-3187
>             Project: Flink
>          Issue Type: Improvement
>    Affects Versions: 1.0.0
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>            Priority: Minor
> Currently, the {{ExecutionGraph}} supports the following restart logic: Whenever a failure
> occurs and the number of restart attempts isn't depleted, wait for a fixed amount of time
> and then try to restart. This behaviour can be controlled by the configuration parameters
> {{execution-retries.default}} and {{execution-retries.delay}}.
> I propose to decouple the restart logic from the {{ExecutionGraph}} a bit by introducing
> a strategy pattern. That would allow us not only to define job-specific restart behaviour
> but also to implement different restart strategies. Conceivable strategies could be: Fixed
> timeout restart, exponential backoff restart, partial topology restarts, etc.
> This change is a preliminary step towards having a restart strategy which will scale
> the parallelism of a job down if not enough slots are available.
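
The proposal in the issue description can be illustrated with a rough sketch of the strategy pattern, assuming a small interface that answers "may we restart?" and "how long should we wait?". All names and signatures here (`RestartLogicSketch`, `nextDelayMillis`, `simulateFailures`, the constructor parameters) are hypothetical and only show the fixed-delay and exponential-backoff variants mentioned in the issue; this is not Flink's implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed strategy pattern; not Flink's actual classes.
final class RestartLogicSketch {

    interface RestartStrategy {
        // True while the restart attempts are not depleted.
        boolean canRestart();

        // Records one restart attempt and returns the delay to wait before it.
        long nextDelayMillis();
    }

    // Mirrors the existing behaviour: a fixed number of attempts, each after the same delay.
    static final class FixedDelayRestartStrategy implements RestartStrategy {
        private final int maxAttempts;
        private final long delayMillis;
        private int attempts;

        FixedDelayRestartStrategy(int maxAttempts, long delayMillis) {
            this.maxAttempts = maxAttempts;
            this.delayMillis = delayMillis;
        }

        @Override public boolean canRestart() { return attempts < maxAttempts; }

        @Override public long nextDelayMillis() {
            attempts++;
            return delayMillis;
        }
    }

    // One conceivable alternative: the delay doubles per attempt, up to a cap.
    static final class ExponentialBackoffRestartStrategy implements RestartStrategy {
        private final int maxAttempts;
        private final long initialDelayMillis;
        private final long maxDelayMillis;
        private int attempts;

        ExponentialBackoffRestartStrategy(int maxAttempts, long initialDelayMillis, long maxDelayMillis) {
            this.maxAttempts = maxAttempts;
            this.initialDelayMillis = initialDelayMillis;
            this.maxDelayMillis = maxDelayMillis;
        }

        @Override public boolean canRestart() { return attempts < maxAttempts; }

        @Override public long nextDelayMillis() {
            long delay = initialDelayMillis;
            for (int i = 0; i < attempts && delay < maxDelayMillis; i++) {
                delay *= 2;
            }
            attempts++;
            return Math.min(delay, maxDelayMillis);
        }
    }

    // Stand-in for the graph: on each failure it only asks the strategy what to do.
    static List<Long> simulateFailures(RestartStrategy strategy, int failures) {
        List<Long> delays = new ArrayList<>();
        for (int i = 0; i < failures && strategy.canRestart(); i++) {
            delays.add(strategy.nextDelayMillis());
        }
        return delays;
    }

    public static void main(String[] args) {
        System.out.println(simulateFailures(new FixedDelayRestartStrategy(3, 1000L), 5));
        System.out.println(simulateFailures(new ExponentialBackoffRestartStrategy(4, 100L, 500L), 10));
    }
}
```

Because the caller only talks to the interface, swapping the fixed-delay behaviour for another strategy would not require changes to the calling code in this sketch.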

This message was sent by Atlassian JIRA
