flink-user mailing list archives

From Aljoscha Krettek <aljos...@apache.org>
Subject Re: Apache Flink Operator State as Query Cache
Date Thu, 12 Nov 2015 10:34:07 GMT
Hi,
I don’t know yet when the operator state will be transitioned to managed memory but it could
happen for 1.0 (which will come after 0.10). The good thing is that the interfaces won’t
change, so state can be used as it is now.

For 0.10, the release vote is winding down right now, so you can expect the release to happen
today or tomorrow. I think streaming is production ready now; for the 1.0 release we expect
mostly hardening and some infrastructure changes (for example, annotations that specify API
stability).

Let us know if you need more information.

Cheers,
Aljoscha
> On 12 Nov 2015, at 02:42, Welly Tambunan <if05041@gmail.com> wrote:
> 
> Hi Stephan, 
> 
> >Storing the model in OperatorState is a good idea, if you can. On the roadmap is
to migrate the operator state to managed memory as well, so that should take care of the GC
issues.
> Is this using off-heap memory? In which version do we expect this to be available?
> 
> Another question: when will the release version of 0.10 be out? We would love to upgrade
> to it when it's available. That version will be production-ready streaming, right?
> 
> 
> 
> 
> 
> On Wed, Nov 11, 2015 at 4:49 PM, Stephan Ewen <sewen@apache.org> wrote:
> Hi!
> 
> In general, if you can keep state in Flink, you get better throughput/latency/consistency
> and have one less system to worry about (an external k/v store). Keeping state outside means
> that the Flink processes can be slimmer, need fewer resources, and as such recover a bit
> faster. There are use cases for that as well.
> 
> Storing the model in OperatorState is a good idea, if you can. On the roadmap is to migrate
the operator state to managed memory as well, so that should take care of the GC issues.
> 
> We are just adding functionality to make the key/value operator state usable in CoMap/CoFlatMap
> as well (currently it only works in windows and in Map/FlatMap/Filter functions over the KeyedStream).
> Until then, you should be able to use a simple Java HashMap and use the "Checkpointed"
> interface to make it persistent.
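A minimal sketch of the interim approach Stephan describes, as I understand it: a plain HashMap holds the cache, and the two methods below mirror what Flink 0.10's `Checkpointed[T]` interface calls (`snapshotState`/`restoreState`). The class and method bodies here are hypothetical illustrations, and the Flink wiring is left out so the snippet stands alone.

```scala
import scala.collection.mutable

// Standalone sketch of the state-holding part of a checkpointed function.
// In Flink 0.10 the user function would implement Checkpointed[...] and
// Flink would invoke snapshotState/restoreState on checkpoint and recovery;
// the Flink dependency is omitted here so this compiles on its own.
class CachingFunction {
  // the "simple Java HashMap" idea: an in-operator query cache
  private val cache = mutable.HashMap[String, Double]()

  // what the flatMap body would do per element: look up, or compute and store
  def lookupOrCompute(key: String, compute: => Double): Double =
    cache.getOrElseUpdate(key, compute)

  // body for Checkpointed.snapshotState: hand back a serializable copy
  def snapshotState(): Map[String, Double] = cache.toMap

  // body for Checkpointed.restoreState: repopulate the cache after a failure
  def restoreState(snapshot: Map[String, Double]): Unit = {
    cache.clear()
    cache ++= snapshot
  }
}
```

The snapshot is an immutable copy, so a checkpoint taken while the job keeps running cannot be mutated afterwards.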
> 
> Greetings,
> Stephan
> 
> 
> On Sun, Nov 8, 2015 at 10:11 AM, Welly Tambunan <if05041@gmail.com> wrote:
> Thanks for the answer. 
> 
> Currently the approach I'm using is creating a base/marker interface to stream different
> types of messages to the same operator. I'm not sure about the performance hit of this
> compared to the CoFlatMap function.
> 
> Basically this provides a query cache, so I'm thinking that instead of using an in-memory
> cache like Redis, Ignite, etc., I can just use operator state for this.
> 
> I just want to gauge whether I need a memory cache or whether operator state would be just fine.

> 
> However, I'm concerned about Gen 2 garbage collection when caching our own state without
> using operator state. Is there any clarification on that?
> 
> 
> 
> On Sat, Nov 7, 2015 at 12:38 AM, Anwar Rizal <anrizal05@gmail.com> wrote:
> 
> Let me understand your case better here. You have a stream of models and a stream of data.
> To process the data, you will need a way to access your model from the subsequent stream
> operations (map, filter, flatMap, ...).
> I'm not sure in which case Operator State is a good choice, but I think you can also
> live without it.
> 
> val modelStream = .... // get the model stream
> val dataStream  =
> 
> modelStream.broadcast.connect(dataStream).coFlatMap( )
> 
> Then you can keep the latest model in a RichCoFlatMapFunction, not necessarily as Operator
> State, although maybe OperatorState is a good choice too.
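Filled out a little, the function body in Anwar's sketch might look like this. The class and names are made up for illustration, the real `CoFlatMapFunction` emits via a `Collector` rather than returning values, and the broadcast/connect wiring is omitted so only the model-keeping logic is shown. Since `modelStream` is the first connected stream, `flatMap1` receives models and `flatMap2` receives data.

```scala
// Sketch of a two-input function that keeps the latest broadcast model.
// Flink would call flatMap1 for each element of the first (model) stream
// and flatMap2 for each element of the second (data) stream; here the
// Flink dependency is omitted so the logic runs standalone.
class ApplyLatestModel {
  // latest model seen so far; identity until the first model arrives
  @volatile private var model: Double => Double = identity

  // first input (the broadcast model stream): remember the newest model
  def flatMap1(newModel: Double => Double): Unit = { model = newModel }

  // second input (the data stream): apply whatever model is current
  def flatMap2(value: Double): Double = model(value)
}
```

In the real API this logic would live in a `RichCoFlatMapFunction` passed to `coFlatMap` after `modelStream.broadcast.connect(dataStream)`.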
> 
> Does it make sense to you ?
> 
> Anwar
> 
> On Fri, Nov 6, 2015 at 10:21 AM, Welly Tambunan <if05041@gmail.com> wrote:
> Hi All, 
> 
> We have high-density data that requires downsampling. However, the downsampling model is
> very flexible, depending on the client device and user interaction, so it would be wasteful
> to precompute and store it in a DB.
> 
> So we want to use Apache Flink to do the downsampling and cache the result for subsequent
> queries.
> 
> We are considering using Flink Operator state for that one. 
> 
> Is operator state the right approach to use as a memory cache? Or is it preferable to use
> a memory cache like Redis, etc.?
> 
> Any comments will be appreciated. 
> 
> 
> Cheers
> -- 
> Welly Tambunan
> Triplelands 
> 
> http://weltam.wordpress.com
> http://www.triplelands.com
> 
> 
> 
> 
> -- 
> Welly Tambunan
> Triplelands 
> 
> http://weltam.wordpress.com
> http://www.triplelands.com
> 
> 
> 
> 
> -- 
> Welly Tambunan
> Triplelands 
> 
> http://weltam.wordpress.com
> http://www.triplelands.com

