flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Xingcan Cui <xingc...@gmail.com>
Subject Re: CoProcess() VS union.Process() & Timers in them
Date Tue, 13 Feb 2018 12:43:53 GMT
Hi Max,

Currently, the timers can only be used with keyed streams. As @Fabian suggested, you can “forge”
a keyed stream with the special KeySelector, which maps all the records to the same key.

IMO, Flink uses keyed streams/states as it’s a deterministic distribution mechanism. Here,
“the parallelism changes” may also refer to a parallelism change after the job restarts
(e.g., when a node crashes). Flink can make sure that all the processing tasks and states
will be safely re-distributed across the new cluster.

Hope that helps.


> On 13 Feb 2018, at 5:18 PM, m@xi <makisntpap@gmail.com> wrote:
> OK Great!
> Thanks a lot for the super ultra fast answer Fabian!
> One intuitive follow-up question.
> So, keyed state is the most preferable one, as it is easy for the Flink
> System to perform the re-distribution in case of change in parallelism, if
> we have a scale-up or scale-down. Also, it is useful to use hash partition a
> stream to different nodes/processors/PU (Processing Units) in general, by
> Keyed State.
> Any other reasons for making Keyed State a must?
> Last but not least, can you elaborate further on the "when the parallelism
> changes" part. I have read this in many topics in this forum, but I cannot
> understand its essence. For example, I define the parallelism of each
> operator in my Flink Job program based on the number of available PU. Maybe
> the essence lies in the fast that the number of PU might change from time to
> time, e.g. add more servers to the cluster where Flink runs and without
> stopping the Flink Job that runs you may perform the rescaling.
> Thanks in advance.
> Best,
> Max
> --
> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

View raw message