flink-user mailing list archives

From prashantnayak <prash...@intellifylearning.com>
Subject Re: S3 recovery and checkpoint directories exhibit explosive growth
Date Wed, 26 Jul 2017 14:57:30 GMT
Thanks Stephan and Stefan

We're looking forward to this patch in 1.3.2.

We will use a patched version, depending on when 1.3.2 becomes
available.

We're also implementing a cron job to remove orphaned/older
completedCheckpoint files, per your recommendations. One caveat with a job
like that is that it has to check whether a particular job is
stopped/paused/down, and also whether the Job Manager is down, so that we
don't accidentally remove valid checkpoint files. That makes it a bit dicey.
The ideal, of course, is not to have to do this at all.
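
A minimal sketch of the selection step such a cleanup job might use. This is not our actual script. It assumes the file listing and the job / Job Manager health checks are done elsewhere and passed in; `CheckpointFile`, `select_deletable`, `keep_newest` and `min_age` are hypothetical names made up for illustration, and matching on a `completedCheckpoint` file-name prefix is an assumption about how the files are named.

```python
from datetime import datetime, timedelta, timezone
from typing import List, NamedTuple


class CheckpointFile(NamedTuple):
    key: str                 # object key or path in the recovery directory
    last_modified: datetime  # timezone-aware modification time


def select_deletable(
    files: List[CheckpointFile],
    job_running: bool,
    jobmanager_up: bool,
    now: datetime,
    keep_newest: int = 3,
    min_age: timedelta = timedelta(days=1),
) -> List[str]:
    """Return keys of completedCheckpoint files that look safe to delete."""
    # If the job or the Job Manager is down, the newest files on disk may be
    # exactly what recovery needs, so delete nothing at all.
    if not (job_running and jobmanager_up):
        return []

    # Assumption: relevant files have a base name starting with
    # "completedCheckpoint".
    candidates = [
        f for f in files
        if f.key.rsplit("/", 1)[-1].startswith("completedCheckpoint")
    ]

    # Newest first; always retain the newest `keep_newest` files.
    candidates.sort(key=lambda f: f.last_modified, reverse=True)
    older = candidates[keep_newest:]

    # Of the remainder, only files at least `min_age` old are deletable.
    return [f.key for f in older if now - f.last_modified >= min_age]
```

The function only returns candidate keys; the actual delete would be a separate step, so the list can be logged and reviewed first. Returning an empty list whenever either health flag is false is the conservative choice for the caveat described above.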

The move away from Hadoop/S3 would be welcome as well.

Flink job state is critical to us, since we have very long-running jobs
(months) processing hundreds of millions of records.

Thanks
Prashant



--
View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/S3-recovery-and-checkpoint-directories-exhibit-explosive-growth-tp14270p14477.html
Sent from the Apache Flink User Mailing List archive at Nabble.com.