flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Niklas Semmler <nsemm...@inet.tu-berlin.de>
Subject Re: Scheduling in BATCH execution mode?
Date Thu, 10 Sep 2015 07:29:54 GMT
Hello Ufuk,

thanks that makes very much sense :).


On 09.09.2015 23:22, Ufuk Celebi wrote:
>> On 09 Sep 2015, at 19:31, Niklas Semmler <nsemmler@inet.tu-berlin.de> wrote:
>> Hello Ufuk,
>> thanks for you amazingly quick reply.
>> I have seen the markFinished in Execution.java, but if I get it right, this is simply
used to stop a task. The ScheduleOrUpdateConsumers message in the pipeline case on the other
hand is notifying the consumers that a pipelined partition is ready and can now be consumed.
Can you give me a hint on how the receiver is notified of a consumable partition in the batch
> Yes, this transitions the state of the respective execution to FINISHED. But if you look
closely there is a call “finishAllBlockingPartitions”, which schedules the receivers,
iff it the last execution finishes the  result.
> You can think about the intermediate data as follows:
> - On the job graph level you have: (Operator) -> [Result] -> (Operator), e.g. (map)
-> [map result] -> (reduce)
> - At runtime when there are multiple parallel tasks, you have:
>    * [Operator subtask 0] -> [Result partition 0] -> (Operator subtask 0)
>    * [Operator subtask 1] -> [Result partition 1] -> (Operator subtask 1)
> Now, if the exchange is blocking (in batch execution mode), the result is finished, iff
all subtasks producing it have finished.
> The question now is: when is the result finished? It is finished, iff all subtasks producing
it have finished. And that’s where the markFinished comes into play. The last to finish
triggers the scheduling. This is non-deterministic, i.e. either subtask 0 or 1 can be the
last to finish it. This is kept track of via a simple counter. The task decrementing it to
0 triggers the scheduling.
> (If the result is pipelined, the first data point triggers the scheduling already via
the code paths you have looked at.)
> Does this help?
> – Ufuk

PhD Student / Research Assistant
INET, TU Berlin
Room 4.029
Marchstr 23
10587 Berlin
Tel: +49 30 314 75739

View raw message