Hi Bastien,
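Each key “belongs” to exactly one parallel instance of a keyed operator,
and each parallel instance handles one or more Key Groups.
A key is hashed into its key group deterministically. The hash depends only
on the key's value, not on the total number of records, so the same key
always lands in the same key group, and (as long as the parallelism does not
change) on the same parallel instance.
Because the assignment is hash-based, an even spread is not guaranteed: with
10 keys and 5 instances, some instances may hold more than 2 keys and others
fewer. Different keys do not affect each other, even when they share a key
group or a parallel instance.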
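
To illustrate, here is a minimal sketch of that mapping in plain Java. It is
not Flink's actual code: the mix() function is a stand-in for the hash Flink
applies internally, the class and method names are made up for this example,
and 128 is just an assumed number of key groups (in Flink this is the
operator's max parallelism). The real logic lives in Flink's
KeyGroupRangeAssignment class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class, not part of Flink.
class KeyGroupSketch {

    // Stand-in bit mixer (NOT Flink's real hash). It only has to be
    // deterministic and return a non-negative value.
    static int mix(int h) {
        h ^= (h >>> 16);
        h *= 0x85ebca6b;
        h ^= (h >>> 13);
        return h & 0x7fffffff;
    }

    // key -> key group: depends only on the key's value.
    static int keyGroupFor(Object key, int numKeyGroups) {
        return mix(key.hashCode()) % numKeyGroups;
    }

    // key group -> parallel instance: contiguous ranges of key groups
    // are assigned to each instance.
    static int instanceFor(int keyGroup, int numKeyGroups, int parallelism) {
        return keyGroup * parallelism / numKeyGroups;
    }

    public static void main(String[] args) {
        int numKeyGroups = 128; // assumed value for this sketch
        int parallelism = 5;    // 5 parallel instances, as in the question

        Map<Integer, List<String>> keysPerInstance = new TreeMap<>();
        for (int i = 0; i < 10; i++) { // 10 possible keys
            String key = "key-" + i;
            int keyGroup = keyGroupFor(key, numKeyGroups);
            int instance = instanceFor(keyGroup, numKeyGroups, parallelism);
            keysPerInstance.computeIfAbsent(instance, k -> new ArrayList<>()).add(key);
        }

        // Running this twice gives the same assignment, but the keys are
        // not necessarily split 2 per instance.
        keysPerInstance.forEach((instance, keys) ->
                System.out.println("instance " + instance + " -> " + keys));
    }
}
```

Note that changing the parallelism changes instanceFor() but not
keyGroupFor(), which is why a key group can be moved as a unit to another
instance when a job is rescaled.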
Best, Hequn
On Wed, Dec 12, 2018 at 6:21 PM bastien dine <bastien.dine@gmail.com> wrote:
> Hello everyone,
>
> I have a question regarding the key state & parallelism of a process
> operation
>
> Doc says: "You can think of Keyed State as Operator State that has been
> partitioned, or sharded, with exactly one state-partition per key. Each
> keyed-state is logically bound to a unique composite of
> <parallel-operator-instance, key>, and since each key “belongs” to exactly
> one parallel instance of a keyed operator, we can think of this simply as
> <operator, key>."
>
> If I have fewer parallel operator instances (say 5) than possible keys
> (10), does it mean that every instance will "manage" the state of 2 keys?
> (Is this spread evenly?)
> Is the logical binding fixed? I mean, is the state always managed by the
> same instance, or does this depend on which instances are available at the
> moment?
>
> "During execution each parallel instance of a keyed operator works with
> the keys for one or more Key Groups."
> > This is related: does "works with the keys" mean always the same keys?
>
> Best Regards,
> Bastien
>
> 
>
> Bastien DINE
> Data Architect / Software Engineer / Sysadmin
> bastiendine.io
>
