flink-dev mailing list archives

From Ufuk Celebi <...@apache.org>
Subject Re: Force enabling checkpoints for iterative streaming jobs
Date Tue, 09 Jun 2015 22:53:16 GMT
Hey Gyula,

I understand your reasoning, but I don't think it's worth rushing this into the release.

As you've said, we cannot give precise guarantees. But such guarantees are arguably one of the key requirements
for any fault tolerance mechanism. Therefore, I disagree that this is better than not having
anything at all. I think it will already go a long way to have the non-iterative case working
reliably.

And as far as I know there are no users really suffering from this at the moment (in the sense
that someone has complained on the mailing list).

Hence, I vote to postpone this.

– Ufuk

On 10 Jun 2015, at 00:19, Gyula Fóra <gyfora@apache.org> wrote:

> Hey all,
> 
> It is currently impossible to enable state checkpointing for iterative
> jobs, because an exception is thrown when creating the jobgraph. This
> behaviour is motivated by the lack of precise guarantees that we can give
> with the current fault-tolerance implementations for cyclic graphs.
> 
> This PR <https://github.com/apache/flink/pull/812> adds an optional flag to
> force checkpoints even in case of iterations. The algorithm will take
> checkpoints periodically as before, but records in transit inside the loop
> will be lost.
> 
> However, even this guarantee is enough for most applications (Machine
> Learning for instance) and certainly much better than not having anything
> at all.
> 
> 
> I suggest we add this to the 0.9 release, as currently many applications
> suffer from this limitation (SAMOA, ML pipelines, graph streaming etc.)
> 
> 
> Cheers,
> 
> Gyula