flink-user mailing list archives

From Zhu Zhu <reed...@gmail.com>
Subject Re: Apache Flink - Question about application restart
Date Thu, 28 May 2020 03:32:46 GMT
Hi M,

Sorry I missed your message.
The JobID does not change once a JobGraph has been generated. However, a new
JobGraph is generated each time a job is submitted, so multiple submissions
produce multiple JobGraphs, each with its own JobID. This is because
different submissions are considered different jobs, as Till mentioned.
For example, you can submit the same application to a cluster several times
concurrently, and different JobIDs are needed to tell those jobs apart.

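The distinction above can be sketched with a toy model. This is not Flink code: SketchJobGraph and submit() are hypothetical names invented for illustration, and a random UUID stands in for the JobID. It only shows the behaviour described, assuming the ID is chosen once when the graph is generated.

```java
import java.util.UUID;

public class JobIdSketch {
    // Hypothetical stand-in for a generated JobGraph: the ID is fixed at construction.
    static final class SketchJobGraph {
        private final UUID jobId = UUID.randomUUID();
        private int attempts = 0;

        UUID getJobId() { return jobId; }

        // Models a restart of the same submitted job: the attempt counter changes, the ID does not.
        void restart() { attempts++; }

        int getAttempts() { return attempts; }
    }

    // Hypothetical stand-in for a submission: every call generates a fresh graph, hence a fresh ID.
    static SketchJobGraph submit() { return new SketchJobGraph(); }

    public static void main(String[] args) {
        SketchJobGraph first = submit();
        UUID before = first.getJobId();
        first.restart();
        System.out.println("same ID after restart: " + before.equals(first.getJobId()));

        SketchJobGraph second = submit();
        System.out.println("same ID across submissions: " + before.equals(second.getJobId()));
    }
}
```

In this model a restart reuses the existing graph object, while a second submission builds a new one, which is why only the latter yields a different ID.
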
Thanks,
Zhu Zhu

Till Rohrmann <trohrmann@apache.org> wrote on Wed, May 27, 2020 at 10:05 PM:

> Hi,
>
> if you submit the same job multiple times, it will be assigned a different
> JobID each time. For Flink, different job submissions are considered to be
> different jobs. Once a job has been submitted, it keeps the same JobID,
> which is important in order to retrieve the checkpoints associated with
> this job.
>
> Cheers,
> Till
>
> On Tue, May 26, 2020 at 12:42 PM M Singh <mans2singh@yahoo.com> wrote:
>
>> Hi Zhu Zhu:
>>
>> I have another clarification - it looks like if I run the same app multiple
>> times, its job id changes. So it looks like, even though the graph is the
>> same, the job id does not depend on the job graph alone, since it differs
>> between runs of the same app.
>>
>> Please let me know if I've missed anything.
>>
>> Thanks
>>
>> On Monday, May 25, 2020, 05:32:39 PM EDT, M Singh <mans2singh@yahoo.com>
>> wrote:
>>
>>
>> Hi Zhu Zhu:
>>
>> Just to clarify - from what I understand, EMR also has a default number of
>> restart attempts (I think it is 3). So if EMR restarts the job, the job id
>> is the same since the job graph is the same.
>>
>> Thanks for the clarification.
>>
>> On Monday, May 25, 2020, 04:01:17 AM EDT, Yang Wang <
>> danrtsey.wy@gmail.com> wrote:
>>
>>
>> Just sharing some additional information.
>>
>> When a Flink application is deployed on Yarn and exhausts its restart
>> policy, the whole application will fail. If you then start another instance
>> (Yarn application), even with high availability configured, it cannot
>> recover from the latest checkpoint because the clusterId (i.e.
>> applicationId) has changed.
>>
>>
>> Best,
>> Yang
>>
>> Zhu Zhu <reedpor@gmail.com> wrote on Mon, May 25, 2020 at 11:17 AM:
>>
>> Hi M,
>>
>> Regarding your questions:
>> 1. yes. The id is fixed once the job graph is generated.
>> 2. yes
>>
>> Regarding yarn mode:
>> 1. the job id stays the same because the job graph is generated once
>> on the client side and persisted in DFS for reuse
>> 2. yes if high availability is enabled
>>
>> Thanks,
>> Zhu Zhu
>>
>> M Singh <mans2singh@yahoo.com> wrote on Sat, May 23, 2020 at 4:06 AM:
>>
>> Hi Flink Folks:
>>
>> If I have a Flink application configured with 10 restarts, and it fails and
>> restarts, then:
>>
>> 1. Does the job have the same id?
>> 2. Does the automatically restarting application pick up from the last
>> checkpoint? I am assuming it does but just want to confirm.
>>
>> Also, if it is running on AWS EMR, I believe EMR/Yarn is configured to
>> restart the job 3 times (after it has exhausted its restart policy). If
>> that is the case:
>> 1. Does the job get a new id? I believe it does, but just want to
>> confirm.
>> 2. Does the Yarn restart honor the last checkpoint? I believe it does
>> not, but is there a way to make it restart from the last checkpoint of the
>> failed job (after it has exhausted its restart policy)?
>>
>> Thanks
>>
>>
>>
