flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Konstantin Knauf <konstantin.kn...@tngtech.com>
Subject Re: Checkpoint state stored in backend, and deleting old checkpoint state
Date Tue, 05 Apr 2016 18:54:10 GMT
Hi Zach,

some answers/comments inline.



On 05.04.2016 20:39, Zach Cox wrote:
> Hi - I have some questions regarding Flink's checkpointing, specifically
> related to storing state in the backends.
> So let's say an operator in a streaming job is building up some state.
> When it receives barriers from all of its input streams, does it store
> *all* of its state to the backend? I think that is what the docs [1] and
> paper [2] imply, but want to make sure. In other words, if the operator
> contains 100MB of state, and the backend is HDFS, does the operator copy
> all 100MB of state to HDFS during the checkpoint?

Yes. With the filesystem backend this happens synchronously, with
RocksDB backend the transfer to HDFS is asynchronous.

> Following on this example, say the operator is a global window and is
> storing some state for each unique key observed in the stream of
> messages (e.g. userId). Assume that over time, the number of observed
> unique keys grows, so the size of the state also grows (the window state
> is never purged). Is the entire operator state at the time of each
> checkpoint stored to the backend? So that over time, the size of the
> state stored for each checkpoint to the backend grows? Or is the state
> stored to the backend somehow just the state that changed in some way
> since the last checkpoint?

The complete state is checkpointed. Incremental backups are currently
not supported, but seem to be on the roadmap.

> Are old checkpoint states in the backend ever deleted / cleaned up? That
> is, if all of the state for checkpoint n in the backend is all that is
> needed to restore a failed job, then all state for all checkpoints m < n
> should not be needed any more, right? Can all of those old checkpoints
> be deleted from the backend? Does Flink do this?

To my knowledge flink takes care of deleting old checkpoints (I think it
says so in the documentation about savepoints.). In my experience
though, if a job is cancelled or crashes, the checkpoint files are
usually not cleaned up. So some housekeeping might be necessary.

> Thanks,
> Zach
> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.0/internals/stream_checkpointing.html
> [2] http://arxiv.org/abs/1506.08603

Konstantin Knauf * konstantin.knauf@tngtech.com * +49-174-3413182
TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
Sitz: Unterföhring * Amtsgericht München * HRB 135082

View raw message