flink-user mailing list archives

From Navneeth Krishnan <reachnavnee...@gmail.com>
Subject State Maintenance
Date Tue, 05 Sep 2017 04:36:54 GMT
Hi All,

I have a couple of questions regarding state maintenance in Flink.

- I have a connected stream, followed by a keyBy operator and then a flatMap
function. I use MapState: keys are added by data from stream1 and
removed by messages from stream2, which acts as a control stream in my
pipeline. When keys are removed, will the corresponding state in
RocksDB also be removed? How does RocksDB get the most recent state?

- Can I use a Guava cache as a MapState value, e.g. MapState<String,
Cache<String, String>>? Do I have to write a serializer to persist data from
the Guava cache?

- One of my downstream operators requires keyed state because I need to
query the state value, but it also has two huge state values that are
basically the same across all parallel operator instances. Initially I used
operator state and checkpointed only in the operator instance with index 0;
the other instances did not checkpoint the same data. How can I achieve this
with keyed state? Each operator instance will hold around 10 GB of the same
data. I am not sure whether this will be a problem in the future.
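
To make the first question concrete, the add/remove pattern can be modelled roughly as below. This is only a minimal plain-Java sketch, not the actual job code and not Flink API: the outer map stands in for keyBy partitioning, the inner map stands in for MapState<String, String>, and the class and method names (ControlledMapStateSketch, onData, onControl) are hypothetical. In the real job the two methods would correspond to the two inputs of the connected stream.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of the pattern in the first question (assumption: String keys and values).
// NOT Flink API: HashMap is a stand-in for keyed MapState backed by RocksDB.
class ControlledMapStateSketch {
    // stream key -> (map key -> value); models one MapState instance per key
    private final Map<String, Map<String, String>> stateByKey = new HashMap<>();

    // Models the stream1 side: a data record adds or overwrites an entry.
    public void onData(String streamKey, String mapKey, String value) {
        stateByKey.computeIfAbsent(streamKey, k -> new HashMap<>()).put(mapKey, value);
    }

    // Models the stream2 (control) side: a control message removes an entry.
    public void onControl(String streamKey, String mapKey) {
        Map<String, String> entries = stateByKey.get(streamKey);
        if (entries == null) {
            return; // nothing stored for this key, so nothing to remove
        }
        entries.remove(mapKey);
        if (entries.isEmpty()) {
            stateByKey.remove(streamKey); // drop the empty per-key map as well
        }
    }

    // True if the entry is still present for the given stream key.
    public boolean contains(String streamKey, String mapKey) {
        Map<String, String> entries = stateByKey.get(streamKey);
        return entries != null && entries.containsKey(mapKey);
    }

    // Number of entries currently held for the given stream key.
    public int size(String streamKey) {
        Map<String, String> entries = stateByKey.get(streamKey);
        return entries == null ? 0 : entries.size();
    }
}
```

The question is whether a removal like the one in onControl is also reflected in what RocksDB stores, or whether the removed entries linger there.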

Thanks,
Navneeth
