flink-user mailing list archives

From Hironori Ogibayashi <ogibaya...@gmail.com>
Subject Re: Handling large state (incremental snapshot?)
Date Tue, 05 Apr 2016 12:40:43 GMT

Thank you for your quick response.
Yes, I am using the FsStateBackend, so I will try the RocksDB backend.
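For reference, a minimal sketch of the switch, based on the state backends page linked below (the checkpoint URI and class name are placeholders for illustration; RocksDBStateBackend lives in the flink-statebackend-rocksdb module):

```java
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SwitchToRocksDB {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Keep operator/window state in embedded RocksDB instances and
        // checkpoint it to a file system path (e.g. HDFS), instead of
        // holding it on the JVM heap as the FsStateBackend does.
        env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints"));

        // Checkpoint every 60 seconds.
        env.enableCheckpointing(60_000);

        // ... build the windowed distinct-count job here, then:
        // env.execute("distinct-count");
    }
}
```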


2016-04-05 21:23 GMT+09:00 Aljoscha Krettek <aljoscha@apache.org>:
> Hi,
> I guess you are using the FsStateBackend, is that correct? You could try
> using the RocksDB state backend:
> https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/state_backends.html#the-rocksdbstatebackend
> With this, throughput will be lower, but the per-checkpoint overhead should
> be smaller. Also, with this backend most of the file copying needed for a
> checkpoint is done while data processing keeps running (asynchronous snapshots).
> As for incremental snapshots: I'm afraid this feature is not implemented yet,
> but we're working on it.
> Cheers,
> Aljoscha
> On Tue, 5 Apr 2016 at 14:06 Hironori Ogibayashi <ogibayashi@gmail.com>
> wrote:
>> Hello,
>> I am trying to implement a windowed distinct count on a stream. In this
>> case, the state has to hold every distinct value in the window, so it can
>> become large.
>> In my test, once the state size reaches about 400MB, checkpointing takes
>> 40 seconds and consumes most of the TaskManager's CPU.
>> Is there any good way to handle this situation?
>> The Flink documentation mentions incremental snapshots, and I am interested
>> in them, but I could not find how to enable them. (Not implemented yet?)
>> https://ci.apache.org/projects/flink/flink-docs-release-1.0/internals/stream_checkpointing.html
>> Regards,
>> Hironori
