flink-user mailing list archives

From Ufuk Celebi <...@apache.org>
Subject Re: Question regarding parallelism
Date Wed, 21 Oct 2015 21:34:01 GMT
Hey Jerry,

On Wed, Oct 21, 2015 at 11:11 PM, Jerry Peng <jerry.boyang.peng@gmail.com> wrote:
> When I submit the job, the number of task slots that gets used
> (displayed on the UI) is only 20.  Why is that? The total number of
> tasks listed on the ui is 55.

Do you mean the number of task slots is 55 (you just wrote tasks)?

Each task slot runs a pipeline of parallel subtasks. In your case the
number of used task slots corresponds to the maximum parallelism of the
job, which is 20. You can have a look at [1]. There is a figure giving an
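
To make the arithmetic concrete, here is a minimal sketch in plain Python (not Flink code). The operator names and parallelism values are hypothetical, chosen only so that the totals match the 55 tasks and 20 slots mentioned in this thread, and it assumes the default behaviour in which one slot can hold one subtask of each task.

```python
# Hypothetical job: the task names and parallelism values below are made up
# so that the totals match the numbers in this thread (55 subtasks, 20 slots).
job = {
    "source": 5,
    "filter->project->flatmap": 20,  # three operators chained into one task
    "aggregation": 20,
    "sink": 10,
}


def total_subtasks(parallelism_by_task):
    """Each task contributes one subtask per unit of parallelism."""
    return sum(parallelism_by_task.values())


def slots_needed(parallelism_by_task):
    """Assuming one slot can hold one subtask of every task, the job
    needs as many slots as its highest task parallelism."""
    return max(parallelism_by_task.values())


print(total_subtasks(job))  # 55
print(slots_needed(job))    # 20
```

Under these assumptions the UI would list 55 subtasks while only 20 slots are occupied, because the subtasks of different tasks share slots.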

> And also why does the
> filter->project->flatmap get compressed into one operator with a
> parallelism of 20?  Can I not set the individual operators (i.e.
> filter and project) to have an individual parallelism of 20?

This is an optimisation that drastically reduces the overhead of the
data exchange between operators. It skips serialisation and results in a
simple chain of local method calls. This is possible because all operators
just forward their data. You can disable it via
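
As a rough illustration of why chaining is cheaper, the following toy model in plain Python (not Flink code) runs the same filter, project and flatmap logic twice: once as direct function calls, and once with a serialise/deserialise step at every operator boundary. The function names are hypothetical, and `pickle` merely stands in for whatever serialisation a real data exchange would use.

```python
import pickle


def keep(record):
    """Stands in for "filter": drop records with an empty text field."""
    return bool(record[1])


def project(record):
    """Stands in for "project": keep only the text field."""
    return record[1]


def flat_map(text):
    """Stands in for "flatmap": emit one output per word."""
    return text.split()


def run_chained(records):
    """Chained: the three functions are called back to back on each record."""
    out = []
    for record in records:
        if keep(record):
            out.extend(flat_map(project(record)))
    return out


def exchange(items):
    """Models handing records to another task: serialise, then deserialise."""
    return [pickle.loads(pickle.dumps(item)) for item in items]


def run_unchained(records):
    """Unchained: same logic, but every operator boundary pays for an exchange."""
    filtered = exchange([r for r in records if keep(r)])
    projected = exchange([project(r) for r in filtered])
    return [word for text in projected for word in flat_map(text)]


records = [(1, "to be"), (2, ""), (3, "or not")]
print(run_chained(records))    # ['to', 'be', 'or', 'not']
print(run_unchained(records))  # ['to', 'be', 'or', 'not']
```

Both variants produce the same result; the chained one simply avoids the two extra serialisation round trips per record.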

Does this help?

– Ufuk

