flink-user mailing list archives

From Aljoscha Krettek <aljos...@apache.org>
Subject Re: Handling large state (incremental snapshot?)
Date Tue, 05 Apr 2016 12:23:33 GMT
Hi,
I guess you are using the FsStateBackend, is that correct? You could try
using the RocksDB state backend:
https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/state_backends.html#the-rocksdbstatebackend

With this, throughput will be lower, but the per-checkpoint overhead could
also be lower. In addition, most of the file copying needed for a checkpoint
is done while data processing keeps running (asynchronous snapshot).
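
For reference, a minimal sketch of switching a job to the RocksDB backend
(Flink 1.0-era API; the checkpoint interval and checkpoint URI below are
placeholders, and the flink-statebackend-rocksdb dependency needs to be on
the classpath):

import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RocksDBBackendSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpoint every 60 seconds (placeholder interval).
        env.enableCheckpointing(60000);

        // Keyed state is kept in RocksDB on local disk; on a checkpoint the
        // RocksDB files are copied to the given URI, largely while the job
        // keeps processing data.
        env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints"));

        // ... define sources, transformations, sinks ...

        env.execute("job with RocksDB state backend");
    }
}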

As for incremental snapshots: I'm afraid this feature is not implemented yet,
but we're working on it.

Cheers,
Aljoscha

On Tue, 5 Apr 2016 at 14:06 Hironori Ogibayashi <ogibayashi@gmail.com>
wrote:

> Hello,
>
> I am trying to implement a windowed distinct count on a stream. In this
> case, the state has to hold all distinct values in the window, so it can
> be large.
>
> In my test, if the state size becomes about 400MB, checkpointing takes
> 40 seconds and consumes most of the TaskManager's CPU.
> Is there any good way to handle this situation?
>
> The Flink documentation mentions incremental snapshots, and I am interested
> in them, but I could not find how to enable them. (Not implemented yet?)
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.0/internals/stream_checkpointing.html
>
> Regards,
> Hironori
>
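
For illustration, a hypothetical sketch of the kind of job described in the
quoted question: a per-key distinct count over a tumbling time window, where
the fold accumulator is the set of distinct values seen so far. That set is
exactly the state that grows large and ends up in each checkpoint. Class
names, the window size, and the inline example source are all made up for
the sketch (Flink 1.0-era DataStream API):

import java.util.HashSet;

import org.apache.flink.api.common.functions.FoldFunction;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;

public class WindowedDistinctCountSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Stand-in for a real source of (key, value) pairs.
        DataStream<Tuple2<String, String>> input = env.fromElements(
                Tuple2.of("a", "x"), Tuple2.of("a", "y"), Tuple2.of("a", "x"));

        input
            .keyBy(0)
            .timeWindow(Time.minutes(10))
            // The fold accumulator is the set of distinct values seen in the
            // window so far; its size is what makes the checkpoints large.
            .fold(new HashSet<String>(),
                  new FoldFunction<Tuple2<String, String>, HashSet<String>>() {
                      @Override
                      public HashSet<String> fold(HashSet<String> distinct,
                                                  Tuple2<String, String> value) {
                          distinct.add(value.f1);
                          return distinct;
                      }
                  })
            // Emit only the distinct count per key and window.
            .map(new MapFunction<HashSet<String>, Integer>() {
                @Override
                public Integer map(HashSet<String> distinct) {
                    return distinct.size();
                }
            })
            .print();

        env.execute("windowed distinct count (sketch)");
    }
}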
