flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael Latta <mla...@technomage.com>
Subject Re: KeyedSream question
Date Fri, 06 Apr 2018 11:01:52 GMT
Yes. It took a bit of digging in the website to find RichFlatMapFunction to get managed state.


Michael

Sent from my iPad

> On Apr 6, 2018, at 3:29 AM, Fabian Hueske <fhueske@gmail.com> wrote:
> 
> Hi, 
> 
> I think Flink is exactly doing what you are looking for.
> If you use keyed state [1], Flink will put the state always in the context of the key
of the currently processed record.
> So if you have a MapFunction with keyed state, and the map() method is called with a
record that has a key A, the state will be the state for key A. If the next record has a key
B, the state will be for key B.
> 
> Best,
> Fabian
> 
> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/state/state.html#keyed-state
> 
> 2018-04-05 14:08 GMT+02:00 Michael Latta <mlatta@technomage.com>:
>> Thanks for the clarification. I was just trying to understand the intended behavior.
It would have been nice if Flink tracked state for downstream operators by key, but I can
do that with a map in the downstream functions. 
>> 
>> Michael
>> 
>> Sent from my iPad
>> 
>>> On Apr 5, 2018, at 2:30 AM, Fabian Hueske <fhueske@gmail.com> wrote:
>>> 
>>> Amit is correct. keyBy() ensures that all records with the same key are processed
by the same paralllel instance of a function.
>>> This is different from "a parallel instance only sees records of one key".
>>> 
>>> I had a look at the docs [1]. 
>>> I agree that "Logically partitions a stream into disjoint partitions, each partition
containing elements of the same key." can be easily interpreted as you did.
>>> I've pushed a commit to clarify the description. The docs should be updated soon.
>>> 
>>> Best, Fabian 
>>> 
>>> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/#datastream-transformations
>>> 
>>> 2018-04-05 6:21 GMT+02:00 Amit Jain <aj2011it@gmail.com>:
>>>> Hi,
>>>> 
>>>> KeyBy operation partition the data on given key and make sure same slot will
>>>> get all future data belonging to same key. In default implementation, it
can
>>>> also map subset of keys in your DataStream to same slot.
>>>> 
>>>> Assuming you have number of keys equal to number running slot then you may
>>>> specify your custom keyBy operation to the achieve the same.
>>>> 
>>>> 
>>>> Could you specify your case.
>>>> 
>>>> --
>>>> Thanks
>>>> Amit
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>>> 
> 

Mime
View raw message