flink-issues mailing list archives

From "yuqi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-9143) Restart strategy defined in flink-conf.yaml is ignored
Date Sat, 07 Apr 2018 08:29:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-9143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16429282#comment-16429282 ]

yuqi commented on FLINK-9143:
-----------------------------

As far as I know, if you set
{code:java}
env.enableCheckpointing(5000, CheckpointingMode.EXACTLY_ONCE);{code}
Flink uses `fixedDelayRestart` by default when no restart strategy is set in the program; see below:
{code:java}
private void configureCheckpointing() {
    CheckpointConfig cfg = streamGraph.getCheckpointConfig();

    long interval = cfg.getCheckpointInterval();
    if (interval > 0) {
        // check if a restart strategy has been set, if not then set the FixedDelayRestartStrategy
        if (streamGraph.getExecutionConfig().getRestartStrategy() == null) {
            // if the user enabled checkpointing, the default number of exec retries is infinite.
            streamGraph.getExecutionConfig().setRestartStrategy(
                RestartStrategies.fixedDelayRestart(Integer.MAX_VALUE, DEFAULT_RESTART_DELAY));
        }
    } else {
        // interval of max value means disable periodic checkpoint
        interval = Long.MAX_VALUE;
    }
    // ... (rest of the method omitted)
}{code}
So this is not a bug. What's your opinion, [~till.rohrmann]?
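
To make the fallback concrete, here is a minimal sketch of the decision made in the snippet above. It is a simplified model, not Flink code: the class `RestartStrategyResolution`, the method `resolve`, and the string labels are hypothetical names used only for illustration. The last branch (falling back to the value from flink-conf.yaml when checkpointing is disabled) is an assumption about behaviour outside the quoted snippet.

```java
// Simplified, hypothetical model of the fallback described above; not Flink code.
final class RestartStrategyResolution {

    /**
     * @param jobLevelStrategy   strategy set in the program, or null if none was set
     * @param checkpointInterval checkpoint interval in ms; a value <= 0 means checkpointing is off
     * @param clusterStrategy    value of restart-strategy from flink-conf.yaml
     * @return a label for the strategy that would take effect under this model
     */
    static String resolve(String jobLevelStrategy, long checkpointInterval, String clusterStrategy) {
        if (jobLevelStrategy != null) {
            // A strategy set in the program is never overridden.
            return jobLevelStrategy;
        }
        if (checkpointInterval > 0) {
            // Mirrors the quoted snippet: checkpointing on and nothing set in the program.
            return "fixed-delay(attempts=" + Integer.MAX_VALUE + ")";
        }
        // Assumption: without checkpointing, the cluster-wide setting applies.
        return clusterStrategy;
    }

    public static void main(String[] args) {
        // The reporter's case: checkpointing every 5000 ms, "none" in flink-conf.yaml.
        System.out.println(resolve(null, 5000, "none"));
        // Same job with the strategy set explicitly in the program.
        System.out.println(resolve("none", 5000, "none"));
    }
}
```

If this model is right, the reporter can keep `restart-strategy: none` in effect by setting the strategy in the job itself, for example with `env.setRestartStrategy(RestartStrategies.noRestart());` before calling `env.execute(...)`.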

> Restart strategy defined in flink-conf.yaml is ignored
> ------------------------------------------------------
>
>                 Key: FLINK-9143
>                 URL: https://issues.apache.org/jira/browse/FLINK-9143
>             Project: Flink
>          Issue Type: Bug
>          Components: Configuration
>    Affects Versions: 1.4.2
>            Reporter: Alex Smirnov
>            Priority: Major
>         Attachments: execution_config.png, jobmanager.log, jobmanager.png
>
>
> Restart strategy defined in flink-conf.yaml is disregarded, when user enables checkpointing.
> Steps to reproduce:
> 1. Download flink distribution (1.4.2), update flink-conf.yaml:
>  
> restart-strategy: none
> state.backend: rocksdb
> state.backend.fs.checkpointdir: file:///tmp/nfsrecovery/flink-checkpoints-metadata
> state.backend.rocksdb.checkpointdir: file:///tmp/nfsrecovery/flink-checkpoints-rocksdb
>  
> 2. create new java project as described at [https://ci.apache.org/projects/flink/flink-docs-release-1.4/quickstart/java_api_quickstart.html]
> here's the code:
> public class FailedJob
> {
>     static final Logger LOGGER = LoggerFactory.getLogger(FailedJob.class);
>     public static void main( String[] args ) throws Exception
>     {
>         final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
>         env.enableCheckpointing(5000, CheckpointingMode.EXACTLY_ONCE);
>         DataStream<String> stream = env.fromCollection(Arrays.asList("test"));
>         stream.map(new MapFunction<String, String>(){
>             @Override
>             public String map(String obj) {
>                 throw new NullPointerException("NPE");
>             } 
>         });
>         env.execute("Failed job");
>     }
> }
>  
> 3. Compile: mvn clean package; submit it to the cluster
>  
> 4. Go to Job Manager configuration in WebUI, ensure settings from flink-conf.yaml are there (screenshot attached)
>  
> 5. Go to Job's configuration, see Execution Configuration section
>  
> *Expected result*: restart strategy as defined in flink-conf.yaml
>  
> *Actual result*: Restart with fixed delay (10000 ms). #2147483647 restart attempts.
>  
>  
> see attached screenshots and jobmanager log (line 1 and 31)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
