flink-dev mailing list archives

From Aljoscha Krettek <aljos...@apache.org>
Subject Re: Question about Flink's savepoint
Date Thu, 14 Sep 2017 13:21:06 GMT
Hi,

What is the source you're using in your Job and what filesystem (if any) is it writing to?

Best,
Aljoscha
> On 5. Sep 2017, at 03:06, Mu Kong <kong.mu.biz@gmail.com> wrote:
> 
> Hi all,
> 
> I have some questions about an experience I had with savepoints.
> Last night my Flink cluster's memory usage seemed weird, so I
> decided to
> 
> 1. create a savepoint for the running job (there was only one job running at
> the time),
> 2. cancel the job from the web UI,
> 3. restart the cluster.
> 
> When I tried to resume the job from that savepoint, I got a
> "Truncate did not truncate to right length. Should be 11757 is 56383."
> exception.
> A savepoint is also created every day at 4 a.m., so after the job failed
> to start from the savepoint I had created before canceling it, I tried
> the 4 a.m. savepoint instead, and
> it seemed to work well.
> 
> Then this morning, I noticed that data is missing for the period between
> canceling the job and resuming it.
> 
> I thought that if I ran the job from the savepoint created at 4 a.m., it
> would start processing data from 4 a.m. Or am I missing something here?
> 
> Also, I didn't set a uid on the result of addSource(). Maybe the
> auto-generated ID changed when I restarted the cluster, and that might be
> the reason why the recovery didn't go well?
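
On the uid question above: Flink matches the state in a savepoint to operators by operator ID. When no uid is set, the ID is generated from the structure of the program, so it only stays the same as long as the job graph is unchanged. The following is a toy model of that idea in plain Java, not Flink code. The class OperatorIdSketch and its methods autoId and ids are hypothetical names, and the hash is only a stand-in for Flink's real ID generation. In a real job the stable ID would be set by calling uid(...) on the stream returned by addSource().

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class OperatorIdSketch {

    // Toy stand-in for an auto-generated operator ID: a hash over the
    // operator and everything upstream of it. This is an assumption made
    // for illustration, NOT Flink's actual algorithm.
    static String autoId(List<String> pipeline, int index) {
        return "auto-" + Integer.toHexString(pipeline.subList(0, index + 1).hashCode());
    }

    // Resolve one ID per operator: the explicit uid if one was given,
    // otherwise the generated one.
    static Map<String, String> ids(List<String> pipeline, Map<String, String> explicitUids) {
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < pipeline.size(); i++) {
            String name = pipeline.get(i);
            String uid = explicitUids.get(name);
            result.put(name, uid != null ? uid : autoId(pipeline, i));
        }
        return result;
    }

    public static void main(String[] args) {
        // Two versions of a hypothetical job: v2 adds a filter before the map.
        List<String> v1 = Arrays.asList("source", "map", "sink");
        List<String> v2 = Arrays.asList("source", "filter", "map", "sink");

        Map<String, String> none = Collections.emptyMap();
        Map<String, String> pinned = Collections.singletonMap("map", "my-map-uid");

        System.out.println("v1, no uids:   " + ids(v1, none));
        System.out.println("v2, no uids:   " + ids(v2, none));
        System.out.println("v1, map uid:   " + ids(v1, pinned));
        System.out.println("v2, map uid:   " + ids(v2, pinned));
    }
}
```

In this model, resubmitting the unchanged program yields the same generated IDs, inserting the filter changes the generated ID of map, and the explicitly assigned uid is unaffected by the change.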

