flink-user mailing list archives

From Mikhail Pryakhin <m.prya...@gmail.com>
Subject Re: Flink-1.6.1 :: HighAvailability :: ZooKeeperRunningJobsRegistry
Date Fri, 26 Oct 2018 14:31:41 GMT
Hi Andrey,

Thanks a lot for your reply!

> What was the full job life cycle? 

1. The job is deployed as a YARN cluster with the following properties set:

	high-availability: zookeeper
	high-availability.zookeeper.quorum: <a list of zookeeper hosts>
	high-availability.zookeeper.storageDir: hdfs:///<recovery-folder-path>
	high-availability.zookeeper.path.root: <flink-root-path>
	high-availability.zookeeper.path.namespace: <flink-job-name>

2. The job is cancelled via the flink cancel <job-id> command.

   What I've noticed:
	when the job is running, the following directory structure is created in ZooKeeper:

	/<flink-root-path>/<flink-job-name>/leader/resource_manager_lock
	/<flink-root-path>/<flink-job-name>/leader/rest_server_lock
	/<flink-root-path>/<flink-job-name>/leader/dispatcher_lock
	/<flink-root-path>/<flink-job-name>/leader/5c21f00b9162becf5ce25a1cf0e67cde/job_manager_lock
	/<flink-root-path>/<flink-job-name>/leaderlatch/resource_manager_lock
	/<flink-root-path>/<flink-job-name>/leaderlatch/rest_server_lock
	/<flink-root-path>/<flink-job-name>/leaderlatch/dispatcher_lock
	/<flink-root-path>/<flink-job-name>/leaderlatch/5c21f00b9162becf5ce25a1cf0e67cde/job_manager_lock
	/<flink-root-path>/<flink-job-name>/checkpoints/5c21f00b9162becf5ce25a1cf0e67cde/0000000000000000041
	/<flink-root-path>/<flink-job-name>/checkpoint-counter/5c21f00b9162becf5ce25a1cf0e67cde
	/<flink-root-path>/<flink-job-name>/running_job_registry/5c21f00b9162becf5ce25a1cf0e67cde


	when the job is cancelled, some ephemeral nodes disappear, but most of them are still there:

	/<flink-root-path>/<flink-job-name>/leader/5c21f00b9162becf5ce25a1cf0e67cde
	/<flink-root-path>/<flink-job-name>/leaderlatch/resource_manager_lock
	/<flink-root-path>/<flink-job-name>/leaderlatch/rest_server_lock
	/<flink-root-path>/<flink-job-name>/leaderlatch/dispatcher_lock
	/<flink-root-path>/<flink-job-name>/leaderlatch/5c21f00b9162becf5ce25a1cf0e67cde/job_manager_lock
	/<flink-root-path>/<flink-job-name>/checkpoints/
	/<flink-root-path>/<flink-job-name>/checkpoint-counter/
	/<flink-root-path>/<flink-job-name>/running_job_registry/
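To double-check which znodes survive cancellation, the paths above can be probed with ZooKeeper's CLI. A minimal sketch that just generates the zkCli.sh commands to run (the ZK_ROOT value and the node list are placeholders taken from the listing above; adjust them to your actual /<flink-root-path>/<flink-job-name> setup):

```shell
#!/bin/sh
# Sketch: build the zkCli.sh commands needed to inspect the znodes that
# remain after `flink cancel`. ZK_ROOT is a hypothetical placeholder for
# /<flink-root-path>/<flink-job-name>.
ZK_ROOT="/flink/my-job"

CMDS=""
for node in \
    "leaderlatch/resource_manager_lock" \
    "leaderlatch/rest_server_lock" \
    "leaderlatch/dispatcher_lock" \
    "checkpoints" \
    "checkpoint-counter" \
    "running_job_registry"; do
  # Each generated line can be piped into: zkCli.sh -server <zookeeper-quorum>
  CMDS="${CMDS}ls ${ZK_ROOT}/${node}
"
done

printf '%s' "$CMDS"
```

Feeding the output into zkCli.sh connected to the quorum shows which of the paths still exist after the cancel.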

> Did you start it with Flink 1.6.1 or canceled job running with 1.6.0? 

I started the job with Flink 1.6.1.


> Was there a failover of Job Master while running before the cancelation?

No, there was no failover; the job is deployed as a YARN cluster (the YARN Cluster High
Availability guide states that no separate failover setup is required).

> What version of Zookeeper do you use?

ZooKeeper 3.4.10

> In general, it should not be the case and all job related data should be cleaned from
> Zookeeper upon cancellation.

As far as I understood, that issue concerns the JobManager failover process, whereas my
question is about a manual, intended cancellation of a job.

Here is the method [1] responsible for cleaning up the ZooKeeper folders, which is called
when the job manager has stopped [2].
It seems it only cleans up the running_job_registry folder; the other folders stay untouched.
I supposed that everything under the /<flink-root-path>/<flink-job-name>/ folder
is cleaned up when the job is cancelled.


[1] https://github.com/apache/flink/blob/8674b69964eae50cad024f2c5caf92a71bf21a09/flink-runtime/src/main/java/org/apache/flink/runtime/highavailability/zookeeper/ZooKeeperRunningJobsRegistry.java#L107
[2] https://github.com/apache/flink/blob/f087f57749004790b6f5b823d66822c36ae09927/flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/Dispatcher.java#L332
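Until the cleanup covers the whole job namespace, the leftover znodes can be removed by hand with ZooKeeper's CLI. A minimal workaround sketch (the ZK_ROOT path is a hypothetical placeholder; `rmr` is the recursive delete in the 3.4.x zkCli, renamed `deleteall` in later ZooKeeper versions; only run it after the job is fully gone):

```shell
#!/bin/sh
# Manual workaround sketch: recursively delete a cancelled job's znode
# namespace. ZK_ROOT is a hypothetical placeholder for
# /<flink-root-path>/<flink-job-name>.
ZK_ROOT="/flink/my-job"

# `rmr` performs a recursive delete in ZooKeeper 3.4.x zkCli
# (later versions call it `deleteall`).
CLEANUP_CMD="rmr ${ZK_ROOT}"

# Pipe into the CLI, e.g.: echo "$CLEANUP_CMD" | zkCli.sh -server <quorum>
echo "$CLEANUP_CMD"
```

This removes the leader, leaderlatch, checkpoints, checkpoint-counter and running_job_registry subfolders in one go, matching what one would expect the cancellation itself to do.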


Kind Regards,
Mike Pryakhin

> On 26 Oct 2018, at 12:39, Andrey Zagrebin <andrey@data-artisans.com> wrote:
> 
> Hi Mike,
> 
> What was the full job life cycle? 
> Did you start it with Flink 1.6.1 or canceled job running with 1.6.0? 
> Was there a failover of Job Master while running before the cancelation?
> What version of Zookeeper do you use?
> 
> Flink creates child nodes to create a lock for the job in Zookeeper.
> The lock is removed by removing the child node (ephemeral).
> A persistent node can be a problem: if the job dies without removing it, the persistent
> node will not time out and disappear as an ephemeral one would, and the next job instance
> will not delete it because it appears to be locked by the previous one.
> 
> There was a recent fix in 1.6.1 where the job data was not properly deleted from
> Zookeeper [1].
> In general, it should not be the case and all job related data should be cleaned from
> Zookeeper upon cancellation.
> 
> Best,
> Andrey
> 
> [1] https://issues.apache.org/jira/browse/FLINK-10011
> 
>> On 25 Oct 2018, at 15:30, Mikhail Pryakhin <m.pryahin@gmail.com> wrote:
>> 
>> Hi Flink experts!
>> 
>> When a streaming job with Zookeeper-HA enabled gets cancelled, all the job-related
>> Zookeeper nodes are not removed. Is there a reason behind that?
>> I noticed that the Zookeeper paths are created of type "Container Node" (a node that
>> can have children and is automatically removed by the server once they are all
>> deleted), falling back to the Persistent node type in case Zookeeper doesn't support
>> this sort of node.
>> But anyway, it is worth removing the job's Zookeeper node when a job is cancelled,
>> isn't it?
>> 
>> Thank you in advance!
>> 
>> Kind Regards,
>> Mike Pryakhin
>> 
> 

