flink-issues mailing list archives

From "mingleizhang (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (FLINK-6643) Flink restarts job in HA even if NoRestartStrategy is set
Date Thu, 08 Jun 2017 01:07:18 GMT

    [ https://issues.apache.org/jira/browse/FLINK-6643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042036#comment-16042036 ]

mingleizhang edited comment on FLINK-6643 at 6/8/17 1:06 AM:
-------------------------------------------------------------

Thanks for the review, [~till.rohrmann]. I raised this only because the log said
{{Using restart strategy NoRestartStrategy for f94b1f7a0e9e3dbcb160c687e476ca77}}, produced by
the code {{log.info(s"Using restart strategy $restartStrategy for $jobId.")}}. That message seems
likely to confuse people who see this kind of logging for the first time. Doesn't it cost
performance to recover a job when there is no real need to, or is it done for fault tolerance
or something like that? I will close this issue and the PR tonight if there are no more questions.



> Flink restarts job in HA even if NoRestartStrategy is set
> ---------------------------------------------------------
>
>                 Key: FLINK-6643
>                 URL: https://issues.apache.org/jira/browse/FLINK-6643
>             Project: Flink
>          Issue Type: Bug
>          Components: JobManager
>    Affects Versions: 1.3.0
>            Reporter: Robert Metzger
>            Assignee: mingleizhang
>            Priority: Critical
>              Labels: flink-rel-1.3.1-blockers
>
> While testing Flink 1.3 RC1, I found that the JobManager is trying to recover a job that had the {{NoRestartStrategy}} set.
> {code}
> 2017-05-19 15:09:04,038 INFO  org.apache.flink.yarn.YarnJobManager - Attempting to recover all jobs.
> 2017-05-19 15:09:04,039 DEBUG org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore - Retrieving all stored job ids from ZooKeeper under flink/application_1494870922226_0064/jobgraphs.
> 2017-05-19 15:09:04,041 INFO  org.apache.flink.yarn.YarnJobManager - There are 1 jobs to recover. Starting the job recovery.
> 2017-05-19 15:09:04,043 INFO  org.apache.flink.yarn.YarnJobManager - Attempting to recover job f94b1f7a0e9e3dbcb160c687e476ca77.
> 2017-05-19 15:09:04,043 DEBUG org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore - Recovering job graph f94b1f7a0e9e3dbcb160c687e476ca77 from flink/application_1494870922226_0064/jobgraphs/f94b1f7a0e9e3dbcb160c687e476ca77.
> 2017-05-19 15:09:04,078 WARN  org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 2017-05-19 15:09:04,142 INFO  org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore - Recovered SubmittedJobGraph(f94b1f7a0e9e3dbcb160c687e476ca77, JobInfo(clients: Set((Actor[akka.tcp://flink@permanent-qa-cluster-master.c.astral-sorter-757.internal:40391/user/$a#-155566858],EXECUTION_RESULT_AND_STATE_CHANGES)), start: 1495206476885)).
> 2017-05-19 15:09:04,142 INFO  org.apache.flink.yarn.YarnJobManager - Submitting recovered job f94b1f7a0e9e3dbcb160c687e476ca77.
> 2017-05-19 15:09:04,143 INFO  org.apache.flink.yarn.YarnJobManager - Submitting job f94b1f7a0e9e3dbcb160c687e476ca77 (CarTopSpeedWindowingExample) (Recovery).
> 2017-05-19 15:09:04,151 INFO  org.apache.flink.yarn.YarnJobManager - Using restart strategy NoRestartStrategy for f94b1f7a0e9e3dbcb160c687e476ca77.
> 2017-05-19 15:09:04,163 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph - Job recovers via failover strategy: full graph restart
> {code}
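
The log above shows a stored job being resubmitted and then the restart strategy being logged for it. One way to read that sequence, which this thread does not confirm, is that recovery from the HA job graph store and the restart strategy are two separate mechanisms. The following is a minimal sketch of that reading only. Every class and method name in it is hypothetical and none of it is Flink's actual API or implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, NOT Flink code: it only illustrates how a job could be
// resubmitted on recovery even though its restart strategy forbids restarts.
public class RecoveryVsRestart {

    /** Stand-in for a restart strategy: consulted only when a running job fails. */
    interface RestartStrategy {
        boolean canRestart();
    }

    /** Models the idea of NoRestartStrategy: never restart after a failure. */
    static final class NoRestart implements RestartStrategy {
        @Override public boolean canRestart() { return false; }
        @Override public String toString() { return "NoRestartStrategy"; }
    }

    /** Stand-in for the HA job graph store (ZooKeeper in the log above). */
    static final class JobGraphStore {
        private final List<String> jobIds = new ArrayList<>();
        void put(String jobId) { jobIds.add(jobId); }
        List<String> recoverAll() { return new ArrayList<>(jobIds); }
    }

    /** In this model a recovering JobManager resubmits every stored job. */
    static List<String> recoverJobs(JobGraphStore store, RestartStrategy strategy) {
        List<String> log = new ArrayList<>();
        for (String jobId : store.recoverAll()) {
            // The strategy is logged here but never consulted (assumption).
            log.add("Submitting job " + jobId + " (Recovery).");
            log.add("Using restart strategy " + strategy + " for " + jobId + ".");
        }
        return log;
    }

    /** In this model the strategy decides only what happens after a task failure. */
    static String onTaskFailure(String jobId, RestartStrategy strategy) {
        return strategy.canRestart()
                ? "Restarting job " + jobId + "."
                : "Failing job " + jobId + ".";
    }

    public static void main(String[] args) {
        String jobId = "f94b1f7a0e9e3dbcb160c687e476ca77";
        JobGraphStore store = new JobGraphStore();
        store.put(jobId);
        RestartStrategy strategy = new NoRestart();
        recoverJobs(store, strategy).forEach(System.out::println);
        System.out.println(onTaskFailure(jobId, strategy));
    }
}
```

In this sketch the recovery path produces the same pair of messages seen in the log, while the strategy only changes the outcome of `onTaskFailure`. Whether Flink 1.3 behaves this way by design is exactly the question raised in this issue.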



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
