flink-user mailing list archives

From Aljoscha Krettek <aljos...@apache.org>
Subject Re: Flink slots, threads, task, etc
Date Tue, 18 Apr 2017 15:24:03 GMT
Sorry that you did not get any responses, but I think everyone was quite busy with Flink Forward
SF. I’m also no expert on the topic but I’ll try to give some answers.

Regarding a Google Doc version, I don’t think there is one. You would have to modify
the Markdown version we have in the docs.

For the other answers I’ll reuse an example program that consists of Source -> Map ->
Sink, with chaining disabled and parallelism 2. We’ll thus have three tasks: Source, Map,
and Sink, each having two subtasks. Let’s denote the subtasks by a number in parentheses,
so the first subtask of Source is Source(1) and the second one is Source(2). I’ll also refer to
Source(1) -> Map(1) -> Sink(1) as a slice of the execution graph, since these can be
executed within one slot.
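
To make the notation concrete, here is a minimal sketch that enumerates the subtasks and slices of the example program. It is a toy model in plain Python, not Flink's API; the names `TASKS`, `PARALLELISM`, `subtasks` and `slices` are made up for illustration.

```python
# Toy model of the example job; this is not Flink's API or scheduler.
TASKS = ["Source", "Map", "Sink"]  # chaining disabled, so each operator is its own task
PARALLELISM = 2


def subtasks(task, parallelism):
    """Parallel instances of one task, numbered from 1."""
    return [f"{task}({i})" for i in range(1, parallelism + 1)]


def slices(tasks, parallelism):
    """Slice i holds the i-th subtask of every task."""
    return [[f"{task}({i})" for task in tasks] for i in range(1, parallelism + 1)]


for s in slices(TASKS, PARALLELISM):
    print(" -> ".join(s))
# Source(1) -> Map(1) -> Sink(1)
# Source(2) -> Map(2) -> Sink(2)
```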

Regarding 1, I think this is true. However, a single slot can execute a complete slice of
the execution graph, where each subtask (from a different task) would be executed by its own
thread.

Regarding 2.1, yes, I think a slot cannot run multiple subtasks of the same task, while it is possible
(and in fact done) to execute all the subtasks of a slice in the same slot.

Regarding 2.2, this is so that a pipeline of parallelism 8 can be executed on a cluster that
has 8 free slots. Basically, each slice fills one slot.
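
The slot arithmetic behind this can be sketched as follows. This is a simplified model, assuming that with slot sharing a slot holds one subtask of each task and that without sharing every subtask would need its own slot; the function names are hypothetical and not part of Flink.

```python
# Simplified slot arithmetic; function names are illustrative, not Flink API.
def slots_with_sharing(parallelism_by_task):
    """One slot per slice, so the highest parallelism decides."""
    return max(parallelism_by_task.values())


def slots_without_sharing(parallelism_by_task):
    """Assumption: without sharing, every subtask occupies its own slot."""
    return sum(parallelism_by_task.values())


job = {"Source": 8, "Map": 8, "Sink": 8}
print(slots_with_sharing(job))     # 8
print(slots_without_sharing(job))  # 24
```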

Regarding 3, I don’t really have an answer.

Regarding 4, yes, this can get a bit out of hand if you have very long pipelines, since every subtask in a slot runs in its own thread.
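
As a rough illustration of how the thread count grows, here is a small sketch. It assumes one thread per subtask, no chaining, and one full slice per slot, and it ignores every other thread in the JVM; `threads_per_slot` and `threads_per_task_manager` are made-up helper names.

```python
# Toy thread count; assumes one thread per subtask and ignores all other JVM threads.
def threads_per_slot(num_tasks):
    """A slot running one full slice has one thread per (unchained) task."""
    return num_tasks


def threads_per_task_manager(num_tasks, slots_per_tm):
    """Every slot of the TaskManager runs one slice."""
    return threads_per_slot(num_tasks) * slots_per_tm


for num_tasks in (3, 20, 100):
    print(num_tasks, threads_per_slot(num_tasks),
          threads_per_task_manager(num_tasks, slots_per_tm=4))
```

In practice chaining reduces the number of tasks in a pipeline, which is one reason the count stays manageable for typical jobs.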

> On 11. Apr 2017, at 14:37, Flavio Pompermaier <pompermaier@okkam.it> wrote:
> Any feedback here..?
> On Wed, Apr 5, 2017 at 7:43 PM, Flavio Pompermaier <pompermaier@okkam.it> wrote:
> Hi to all,
> I had a very long but useful chat with Fabian and I understood a lot of concepts that
> were not clear at all to me. We started from the Flink runtime documentation page
> (https://ci.apache.org/projects/flink/flink-docs-release-1.2/concepts/runtime.html).
> I discovered that the terminology is very inconsistent and misleading throughout the page...
> For example, one of the very first sentences is:
> "Flink chains operator subtasks together into tasks. Each task is executed by one thread."
> What I first understood was that every operator can be executed by only a single thread
> in the whole cluster... it would probably be better to say "one thread per task slot" (at least).

> Moreover, if I'm not wrong, a Task Slot can execute only 1 subtask (aka parallel instance)
> of each task and there's no limit to the number of subtasks per slot (and this is not highlighted
> at all in that document). The only constraint is that they should belong to different tasks.
> If there's a Google Doc version of that page I could try to rewrite it in order
> to make some parts easier to understand... however, I still have some more questions:
> 1. Is it correct that a single Task Slot can execute only a single subtask of each task
>    and that this subtask is executed by a single thread within the slot?
> 2. If it is so:
>    2.1. why does that page say "By default, Flink allows subtasks to share slots even
>         if they are subtasks of different tasks, so long as they are from the same job"? It
>         reads as if it were more common to run multiple subtasks of the same task (in a slot)
>         than to execute subtasks of different tasks, although this is still permitted... from
>         what I understood a slot cannot run multiple subtasks of the same task at all!
>    2.2. and why this constraint? Is there any good reason for it? A subtask is mapped to 1
>         thread in the TaskManager, so why can a TM with 2 slots run 2 subtasks of the same
>         task (in the same JVM) while a TM with 1 slot cannot (although it can execute an
>         arbitrary number of subtasks of different tasks)?
> 3. If it is not so, there are no images representing such a situation on that page...
> 4. Isn't it dangerous to allow (potentially) an unlimited number of threads per TM slot?
> Cheers,
> Flavio
