flink-user mailing list archives

From Fabian Hueske <fhue...@gmail.com>
Subject Re: KeyedSream question
Date Thu, 05 Apr 2018 09:30:46 GMT
Amit is correct. keyBy() ensures that all records with the same key are
processed by the same parallel instance of a function.
This is different from "a parallel instance only sees records of one key":
a single instance can process records for many different keys.
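
To illustrate the difference, here is a minimal plain-Java sketch. It is not
Flink code and it does not reproduce Flink's actual key-to-instance
assignment; it only assumes a simple "hash of the key modulo parallelism"
rule. The class KeyByDemo and the helpers instanceFor() and assign() are
hypothetical names made up for this example.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Simplified stand-in for keyBy() routing; NOT Flink's real assignment logic.
final class KeyByDemo {

    // Hypothetical helper: maps a key to the index of a parallel instance.
    // Assumption: plain hashCode() modulo parallelism.
    static int instanceFor(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    // Groups the distinct keys by the instance that would process them.
    static Map<Integer, Set<String>> assign(List<String> keys, int parallelism) {
        Map<Integer, Set<String>> byInstance = new TreeMap<>();
        for (String key : keys) {
            byInstance
                .computeIfAbsent(instanceFor(key, parallelism), i -> new TreeSet<>())
                .add(key);
        }
        return byInstance;
    }

    public static void main(String[] args) {
        // Five distinct keys, two parallel instances.
        List<String> records = List.of("a", "b", "c", "d", "e", "a", "c");
        assign(records, 2).forEach((instance, keys) ->
            System.out.println("instance " + instance + " sees keys " + keys));
        // prints:
        // instance 0 sees keys [b, d]
        // instance 1 sees keys [a, c, e]
    }
}
```

Every record with key "a" goes to the same instance, but that instance also
receives the records for "c" and "e". With more keys than parallel instances,
some instance necessarily handles several keys.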

I had a look at the docs [1].
I agree that "Logically partitions a stream into disjoint partitions, each
partition containing elements of the same key." can be easily interpreted
as you did.
I've pushed a commit to clarify the description. The docs should be updated

Best, Fabian

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/

2018-04-05 6:21 GMT+02:00 Amit Jain <aj2011it@gmail.com>:

> Hi,
> The keyBy operation partitions the data on the given key and makes sure
> the same slot gets all future data belonging to the same key. In the
> default implementation, it can also map a subset of the keys in your
> DataStream to the same slot.
> Assuming the number of keys equals the number of running slots, you may
> specify a custom keyBy operation to achieve the same.
> Could you describe your use case?
> --
> Thanks
> Amit
> --
> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
