flink-user mailing list archives

From Fabian Hueske <fhue...@gmail.com>
Subject Re: Flink groupBy
Date Wed, 19 Apr 2017 13:45:27 GMT
Hi Alieh,

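By default, Flink uses hash partitioning to assign grouping keys to parallel
tasks. All records with the same key go to the same task, but the assignment
does not take group sizes into account, so the load can be skewed.
You can implement a custom partitioner or use range partitioning (which has
some overhead) to control the skew.

There is no automatic load balancing.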
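
To illustrate with the numbers from your question, here is a minimal, Flink-free sketch of what can happen. Everything in it is an assumption made for illustration: the `Partitioner` interface is a local stand-in that only mirrors the shape of Flink's `Partitioner<K>` (`int partition(K key, int numPartitions)`), the keys "A", "B", "C" and the lookup table are made up, and the hash rule is simplified to `hashCode` modulo parallelism (Flink's internal hashing differs in detail).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

final class GroupSkewSketch {

    /** Local stand-in with the same shape as Flink's Partitioner<K> (assumption for this sketch). */
    interface Partitioner<K> {
        int partition(K key, int numPartitions);
    }

    /** Simplified hash partitioning: key hash modulo the number of parallel tasks. */
    static final Partitioner<String> HASH =
            (key, n) -> Math.floorMod(key.hashCode(), n);

    /** Hypothetical custom partitioner: explicit key-to-task table, hash as fallback. */
    static Partitioner<String> lookup(Map<String, Integer> table) {
        return (key, n) ->
                Math.floorMod(table.containsKey(key) ? table.get(key) : key.hashCode(), n);
    }

    /** Sums the number of records each parallel task receives. */
    static long[] load(Map<String, Integer> groupSizes, Partitioner<String> p, int parallelism) {
        long[] perTask = new long[parallelism];
        for (Map.Entry<String, Integer> e : groupSizes.entrySet()) {
            perTask[p.partition(e.getKey(), parallelism)] += e.getValue();
        }
        return perTask;
    }

    /** Groups of size 10, 5, 5 as in the question; the key names are made up. */
    static Map<String, Integer> demoGroups() {
        Map<String, Integer> groups = new LinkedHashMap<>();
        groups.put("A", 10);
        groups.put("B", 5);
        groups.put("C", 5);
        return groups;
    }

    /** Hand-written assignment: the large group alone, the two small ones together. */
    static Map<String, Integer> demoTable() {
        Map<String, Integer> table = new LinkedHashMap<>();
        table.put("A", 0);
        table.put("B", 1);
        table.put("C", 1);
        return table;
    }

    public static void main(String[] args) {
        // Hash assignment ignores group sizes, so the result depends only on the key hashes.
        System.out.println(Arrays.toString(load(demoGroups(), HASH, 2)));
        // The lookup table encodes the balanced layout explicitly.
        System.out.println(Arrays.toString(load(demoGroups(), lookup(demoTable()), 2)));
    }
}
```

With these particular keys the hash rule puts 5 records on one task and 15 on the other, while the table gives 10 and 10. In a real job the same kind of logic would go into an implementation of Flink's `Partitioner` interface, and it only helps if you know (or can estimate) the group sizes up front.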

Best, Fabian

2017-04-19 14:42 GMT+02:00 Alieh <saeedi@informatik.uni-leipzig.de>:

> Hi All
>
> Is there any way in Flink to send a process to a reducer?
>
> If I do "test.groupBy(1).reduceGroup", is each group processed on one
> reducer? And if the number of groups is more than the number of task slots
> we have, does Flink distribute the processing evenly? I mean, if we have for
> example groups of size 10, 5, 5 and we have two task slots, is the processing
> distributed in this way?
>
> task slot1: group of size 10
>
> task slot2: two groups of size 5
>
> Best,
>
> Alieh
>
>
