flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ufuk Celebi <...@apache.org>
Subject Re: Parallelism and Partitioning
Date Mon, 06 Feb 2017 15:41:12 GMT
On Fri, Feb 3, 2017 at 2:09 AM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:
> What is the granularity of parallelism in flink? For eg: if I am reading
> from Kafka -> map -> Sink and I say parallel 2 does it mean it creates 2
> consumer threads and allocates it on 2 separate task managers?

Yes, this will instantiate two instances of the Kafka source, the map
operator, and the sink. These parallel sub pipelines will be scheduled
to separate slots (that might happen to on the same TM). See here for
more details: https://ci.apache.org/projects/flink/flink-docs-release-1.3/internals/job_scheduling.html

Partitioning of state happens in key groups, which define a range of
keys. A single subtask is usually responsible for more than a single
key group. The key groups are the units of rescaling your program.
This is configurable via the setMaxParallelism() method on the
environment.

Mime
View raw message