flink-user mailing list archives

From prashantnayak <prash...@intellifylearning.com>
Subject Re: S3 recovery and checkpoint directories exhibit explosive growth
Date Wed, 26 Jul 2017 14:57:30 GMT
Thanks Stephan and Stefan

We're looking forward to this patch in 1.3.2

We will use a patched version depending upon when 1.3.2 is going to be released.

We're also implementing a cron job to remove orphaned/older
completedCheckpoint files, per your recommendations. One caveat with a job
like that is that we have to check whether a particular job is
stopped/paused/down, and also whether the Job Manager is down, so that we
don't accidentally remove valid checkpoint files. This makes it a bit dicey.
Ideally, of course, we would not have to do this at all.
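
A minimal sketch of the guard described above, in Python. It only decides which files would be safe to delete and makes no S3 or Flink calls; the caller is assumed to supply the file listing and the health flags by whatever means it has. The names `StoredFile`, `select_orphans`, `referenced_keys` and `min_age` are hypothetical and not part of any Flink API, and the one-day default age is an arbitrary illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Set


@dataclass(frozen=True)
class StoredFile:
    """One object in the recovery directory (hypothetical helper type)."""
    key: str                 # full object key, e.g. "recovery/completedCheckpoint..."
    last_modified: datetime  # timezone-aware modification time


def select_orphans(
    files: Iterable[StoredFile],
    referenced_keys: Set[str],
    job_running: bool,
    job_manager_up: bool,
    now: datetime,
    min_age: timedelta = timedelta(days=1),
) -> List[str]:
    """Return keys of completedCheckpoint files that look safe to remove.

    Assumption: the caller already knows which files are still referenced
    and whether the job and the Job Manager are healthy.
    """
    # If the job is stopped/paused/down or the Job Manager is down, an old
    # file may be the only valid checkpoint, so nothing is selected.
    if not (job_running and job_manager_up):
        return []

    cutoff = now - min_age
    return sorted(
        f.key
        for f in files
        # Only consider files named completedCheckpoint*.
        if f.key.rsplit("/", 1)[-1].startswith("completedCheckpoint")
        # Never select a file that is still referenced.
        and f.key not in referenced_keys
        # Leave recent files alone; they may still be in use.
        and f.last_modified < cutoff
    )
```

The function returns an empty list whenever either health flag is false, so a cron run during an outage removes nothing rather than guessing.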

The move away from hadoop/s3 would be welcome as well.

Flink job state is critical to us since we have very long-running jobs
(months) processing hundreds of millions of records.


View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/S3-recovery-and-checkpoint-directories-exhibit-explosive-growth-tp14270p14477.html
Sent from the Apache Flink User Mailing List archive at Nabble.com.
