flink-user mailing list archives

From Robert Metzger <rmetz...@apache.org>
Subject Re: Flink streaming with 1+ TB of managed state
Date Sat, 19 Nov 2016 12:16:01 GMT
Hi Steven,

According to this presentation, King.com is using Flink with terabytes of
state:
http://flink-forward.org/wp-content/uploads/2016/07/Gyulo-Fo%CC%81ra-RBEA-Scalable-Real-Time-Analytics-at-King.compressed.pdf
(see Page 4 specifically)

For the 90GB experiment, what is the expected time for transferring 90 GB
of data in your environment?
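
As a rough way to frame that question, here is a back-of-the-envelope sketch in Python. It is not taken from Flink or from the thread: the function names and the 1 and 10 Gbit/s link speeds are assumptions chosen for illustration, and it models only an idealized raw transfer with no serialization, compression, or storage overhead. The only numbers from the thread are 90 GB, 5 minutes, and the estimated 5 TB.

```python
def transfer_seconds(size_gb: float, link_gbit_per_s: float, links: int = 1) -> float:
    """Idealized time to move size_gb (decimal gigabytes) over `links`
    parallel network links, each with link_gbit_per_s of usable bandwidth."""
    size_gbit = size_gb * 8  # bytes to bits
    return size_gbit / (link_gbit_per_s * links)


def implied_throughput_mb_per_s(size_gb: float, seconds: float) -> float:
    """Aggregate throughput implied by moving size_gb in the given time."""
    return size_gb * 1000 / seconds


if __name__ == "__main__":
    # Hypothetical link speeds; substitute the values for your own cluster.
    for gbit in (1, 10):
        print(f"90 GB over one {gbit} Gbit/s link: {transfer_seconds(90, gbit):.0f} s")

    # Observed in the thread: 90 GB checkpointed or recovered in about 5 minutes.
    observed = implied_throughput_mb_per_s(90, 5 * 60)
    print(f"Implied aggregate throughput: {observed:.0f} MB/s")

    # Naive extrapolation to 5 TB at the same rate (assumes linear scaling,
    # which may not hold in practice).
    hours = 5000 * 1000 / observed / 3600
    print(f"5 TB at that rate: about {hours:.1f} hours")
```

If the observed 5 minutes is close to what the idealized figure predicts for your network, the checkpoint is likely bound by transfer bandwidth rather than by Flink itself.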

Regards,
Robert


On Sat, Nov 19, 2016 at 1:41 AM, Steven Ruppert <steven@fullcontact.com>
wrote:

> Hi,
>
> Is anybody currently running flink streaming with north of a terabyte
> (TB) of managed state? If you are, can you share your experiences wrt
> hardware, tuning, recovery situations, etc?
>
> I'm evaluating flink for a use case I estimate will take around 5TB of
> state in total, but looking at the actual implementation of the
> rocksDB state and current lack of incremental checkpointing or
> recovery, it doesn't seem feasible.
>
> I have successfully tested flink up to roughly 90GB of managed state
> in rocksDB, but that's taking 5 minutes to checkpoint or recover (on a
> pretty beefy YARN cluster).
>
> For most cases, my state updates are idempotent and can be moved to
> something external. However, it'd be nice to know of any current or
> future plans for running flink at the terabyte scale.
>
> --Steven
