flink-user mailing list archives

From Nico Kruber <n...@data-artisans.com>
Subject Re: state inside functions
Date Thu, 03 Aug 2017 14:00:20 GMT
Hi Peter,
there's no need to worry about transient members as far as checkpointing is 
concerned: the operator itself is not serialized into a checkpoint - only the 
state itself, depending on the state back-end.
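
Separately from checkpointing, Flink ships each function instance to the 
workers using Java serialization when the job is submitted, and that is where 
the transient keyword comes into play. The plain-Java sketch below needs no 
Flink at all; TransientDemo, ScalingFunction and its open() method are 
hypothetical stand-ins that only mimic the constructor / open() split from 
points (1) and (2) of the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class TransientDemo {

    // Hypothetical stand-in for a rich function; this is not a Flink class.
    static class ScalingFunction implements Serializable {
        private static final long serialVersionUID = 1L;

        // (1) non-transient: set in the constructor, travels with the serialized object
        final int factor;

        // (2) transient: skipped by Java serialization, so it has to be rebuilt in open()
        transient List<Integer> buffer;

        ScalingFunction(int factor) {
            this.factor = factor;
        }

        // Mimics a lifecycle hook that runs on the worker before any record is processed.
        void open() {
            buffer = new ArrayList<>();
        }

        int map(int value) {
            int result = value * factor;
            buffer.add(result);
            return result;
        }
    }

    // Serializes and deserializes the object in memory, similar to shipping it to a worker.
    static ScalingFunction roundTrip(ScalingFunction fn) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(fn);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (ScalingFunction) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ScalingFunction shipped = roundTrip(new ScalingFunction(3));
        System.out.println(shipped.factor);         // prints 3: constructor value survived
        System.out.println(shipped.buffer == null); // prints true: transient field was skipped
        shipped.open();
        System.out.println(shipped.map(7));         // prints 21
    }
}
```

In short: small serializable configuration can be passed through the 
constructor, while anything that is not serializable or only makes sense on 
the worker is better declared transient and created in open().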

If you want your state to be recovered by checkpoints you should implement the 
open() method and initialise your state there as in your point (2) and as 
described in [1].
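
A minimal sketch of that pattern, written against the Flink 1.3-era API that 
[1] documents (later releases may differ). The class name CountPerKey and the 
state name "count" are made up for illustration.

```java
import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

// Hypothetical example: keeps a running sum per key in checkpointed keyed state.
public class CountPerKey
        extends RichFlatMapFunction<Tuple2<String, Long>, Tuple2<String, Long>> {

    // Only a handle to the state; the value itself lives in the state back-end.
    private transient ValueState<Long> count;

    @Override
    public void open(Configuration parameters) throws Exception {
        // Register the state under a name; after a restore the handle points
        // at the recovered value for the current key.
        ValueStateDescriptor<Long> descriptor =
            new ValueStateDescriptor<>("count", Long.class);
        count = getRuntimeContext().getState(descriptor);
    }

    @Override
    public void flatMap(Tuple2<String, Long> in, Collector<Tuple2<String, Long>> out)
            throws Exception {
        Long current = count.value();            // null if nothing is stored for this key yet
        long updated = (current == null ? 0L : current) + in.f1;
        count.update(updated);                   // included in the next checkpoint
        out.collect(Tuple2.of(in.f0, updated));
    }
}
```

Keyed state like this is only available on a keyed stream, so the function 
would be applied as something like stream.keyBy(0).flatMap(new CountPerKey()).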

If you want to re-scale your job, you have to take a savepoint and resume from 
it with a different parallelism [2]. For this to work, be sure to set a maximum 
parallelism (per job or per operator) and to set UUIDs for operators, as 
described in [3].
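
As a rough sketch of those two settings in the DataStream API: the job below 
is a toy pipeline, and the class name, the operator ID "uppercase-map" and the 
value 128 are arbitrary choices for illustration, not recommendations.

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Hypothetical toy job showing where max parallelism and operator IDs are set.
public class RescalableJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();

        // Job-wide upper bound for the parallelism the job can be rescaled to.
        env.setMaxParallelism(128);

        env.fromElements("a", "b", "c")
            .map(new MapFunction<String, String>() {
                @Override
                public String map(String value) {
                    return value.toUpperCase();
                }
            })
            .uid("uppercase-map")     // stable ID so savepoint state can be matched to this operator
            .setMaxParallelism(128)   // optional per-operator override of the job-wide setting
            .print();

        env.execute("rescalable-job");
    }
}
```

With the command-line client, a savepoint can then be triggered with 
"bin/flink savepoint <jobID>" and the job resumed at a new parallelism with 
"bin/flink run -s <savepointPath> -p <parallelism> <jarFile>"; the 
placeholders are left unfilled on purpose.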


Nico

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/state.html
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.4/setup/savepoints.html
[3] https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/production_ready.html

On Thursday, 3 August 2017 12:11:14 CEST Peter Ertl wrote:
> Hi,
> 
> can someone elaborate on when I should set properties transient /
> non-transient within operators (e.g. map / flatMap / reduce) ?
> 
> I see these two possibilities:
> 
> (1) initialize a non-transient property from the constructor
> (2) initialize a transient property inside a Rich???Function when
> open(ConfigurationParameters) is invoked
> 
> on what criteria should I choose (1) or (2) ?
> 
> how is this related to checkpointing / rebalancing?
> 
> Thanks in advance
> Peter

