flink-user mailing list archives

From Don Frascuchon <frascuc...@gmail.com>
Subject Re: Checkpoint for exact-once proccessing
Date Wed, 13 Jan 2016 11:25:11 GMT
Hi Stephan,

Thanks for your quick response.

So, consider an operator task that has processed two records, with no
barrier received yet. If the task fails and must be recovered, the last
consistent snapshot will be used, which does not include information about
the records that were processed but not checkpointed. What happens in this
situation? Will those records be resent to the failed task afterwards, or
will they be discarded? How does Flink manage information about these
records to provide exactly-once guarantees? Must the user function inside
the operator be idempotent? (I am thinking of some kind of persistence in a
sink task.)

Thanks in advance!

On Wed, 13 Jan 2016 at 11:17, Stephan Ewen (<sewen@apache.org>) wrote:

> Hi!
> I think there is a misunderstanding. There are no identifiers maintained
> and no individual records deleted.
> On recovery, all operators reset their state to a consistent snapshot:
> https://ci.apache.org/projects/flink/flink-docs-release-0.10/internals/stream_checkpointing.html
> Greetings,
> Stephan
> On Wed, Jan 13, 2016 at 11:08 AM, Don Frascuchon <frascuchon@gmail.com>
> wrote:
>> Hello,
>> I'm trying to understand how checkpointing provides exactly-once
>> processing in Flink, and I have some questions.
>> The documentation says that when there is a failure and the state of an
>> operator is restored, the already processed records are deleted based on
>> their identifiers.
>> My question is: how are these identifiers maintained between two
>> checkpoints? Does Flink persist every new input record that arrives at the
>> stateful operator before taking the checkpoint? Otherwise, there may be
>> messages to reprocess after a failure.
>> Thanks !!!
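
For readers following the thread, here is a minimal sketch of the recovery
behavior Stephan describes: on failure, operator state is reset to the last
consistent snapshot and the input is replayed from the position recorded in
that snapshot. This is a toy Python model, not Flink code. The names
ReplayableSource, CountingOperator and run are hypothetical, and the model
assumes a source that can rewind to a stored offset, as the linked
checkpointing documentation requires of replayable sources.

```python
# Toy model of snapshot-and-replay recovery. None of these names are Flink APIs.

class ReplayableSource:
    """A source that can rewind to a previously recorded offset (like a log)."""

    def __init__(self, records):
        self.records = list(records)
        self.offset = 0

    def has_next(self):
        return self.offset < len(self.records)

    def next(self):
        record = self.records[self.offset]
        self.offset += 1
        return record


class CountingOperator:
    """A stateful operator whose only state is a running sum."""

    def __init__(self):
        self.total = 0

    def process(self, record):
        self.total += record


def run(records, checkpoint_every, fail_after):
    """Process records, failing once after `fail_after` records were handled.

    Returns the final operator state and every record handed to the
    (hypothetical) external side effect, including replayed ones.
    """
    source = ReplayableSource(records)
    op = CountingOperator()
    # The snapshot stores the source position together with the operator state.
    snapshot = {"offset": 0, "total": 0}
    emitted = []      # side effects visible outside the job
    processed = 0
    failed = False

    while source.has_next():
        if not failed and processed == fail_after:
            # Simulated failure: roll state AND input position back together.
            failed = True
            source.offset = snapshot["offset"]
            op.total = snapshot["total"]
            continue

        record = source.next()
        op.process(record)
        emitted.append(record)
        processed += 1

        # Take a snapshot every `checkpoint_every` input records.
        if source.offset % checkpoint_every == 0:
            snapshot = {"offset": source.offset, "total": op.total}

    return op.total, emitted


total, emitted = run([1, 2, 3, 4, 5, 6], checkpoint_every=3, fail_after=5)
print(total)
print(emitted)
```

In this run the snapshot is taken after the third record and the failure
happens after the fifth, so records 4 and 5 are the "processed but not
checkpointed" records from the question. They are not discarded: they are
processed again after the rollback. The operator state still ends up as if
each record had been counted once, because the partial updates were rolled
back with the snapshot. The emitted list, however, contains 4 and 5 twice,
which is why writes to an external system from a sink need to be idempotent
or otherwise deduplicated if duplicates must not be visible there.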
