flink-dev mailing list archives

From Stephan Ewen <se...@apache.org>
Subject Re: Change in the JobManager API
Date Sat, 20 Sep 2014 18:25:18 GMT
Edit: I have not pushed it, I am about to push ;-)

I just needed to rebase on the latest master, and tests are pending...

On Sat, Sep 20, 2014 at 8:24 PM, Stephan Ewen <sewen@apache.org> wrote:

> Hi!
>
> I have just pushed a big patch to rework the JobManager job and scheduling
> classes. It fixes some scalability and robustness issues,
> simplifies the task hierarchies, and makes the code ready for some of the
> planned next features (incremental/interactive jobs).
>
> The pull request is https://github.com/apache/incubator-flink/pull/122
>
> What will affect developers who work against the lower-level APIs (like
> the streaming parts) is the following:
>
>  - No more distinction between input/intermediate/output tasks.
>  - Intermediate data sets have a data structure now. This implies that
>    some methods change slightly (more in name than in meaning).
>    In the future, data sets can be consumed many times, but for now, the
>    network stack supports only one consumer.
>  - Conceptually, receivers now attach senders as inputs (and grab
>    their outgoing data streams), rather than senders forwarding to
>    receivers. This means that the wiring of JobGraphs is now the other
>    way around.
>  - No more distinction between in-memory and network channels. All
>    channels have always been automatically in-memory when sender
>    and receiver are co-located. The flag was purely a scheduler hint,
>    which is obsolete now (see below).
>
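
To make the inverted wiring concrete, here is a minimal sketch of the idea that the receiver attaches the sender as an input. All class and method names (SketchVertex, SketchDataSet, connectTo) are hypothetical stand-ins invented for illustration; they are not the classes or signatures from the pull request.

```java
// Hypothetical stand-ins for illustration only; not the Flink classes
// from the pull request.
import java.util.ArrayList;
import java.util.List;

final class SketchDataSet {
    final SketchVertex producer;
    SketchVertex consumer; // the mail notes only one consumer is supported for now

    SketchDataSet(SketchVertex producer) {
        this.producer = producer;
    }
}

final class SketchVertex {
    final String name;
    final List<SketchDataSet> produced = new ArrayList<>();
    final List<SketchDataSet> inputs = new ArrayList<>();

    SketchVertex(String name) {
        this.name = name;
    }

    // Called on the receiver: it creates a data set on the sender and
    // registers that data set as one of its own inputs.
    SketchDataSet connectTo(SketchVertex sender) {
        SketchDataSet dataSet = new SketchDataSet(sender);
        sender.produced.add(dataSet);
        dataSet.consumer = this;
        inputs.add(dataSet);
        return dataSet;
    }
}

final class WiringSketch {
    public static void main(String[] args) {
        SketchVertex source = new SketchVertex("source");
        SketchVertex mapper = new SketchVertex("mapper");
        SketchVertex sink = new SketchVertex("sink");

        // The graph is wired from the receiving side backwards.
        mapper.connectTo(source);
        sink.connectTo(mapper);

        for (SketchVertex v : new SketchVertex[] {source, mapper, sink}) {
            System.out.println(v.name + ": inputs=" + v.inputs.size()
                    + ", produced=" + v.produced.size());
        }
    }
}
```

The point of the sketch is only the direction of the call: the edge is created by asking the receiver to connect to the sender, and the intermediate data set sits between them as its own object.
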
>
> Most importantly:
>  - The scheduling is a bit different now. Instead of instance sharing, we
>    now have SlotSharing Groups, which give you a way to share resources
>    across tasks, but they behave more dynamically. That is important for
>    more dynamic environments, and when a cluster has fewer task slots
>    than the parallelism of some tasks.
>  - For cases that need strict co-location of tasks, we now have
> CoLocationConstraints. The Batch API uses them to ensure that
>    head, tail, and tasks inside a closed-loop iteration are co-located.
>
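
As a rough illustration of sharing slots across tasks, the toy model below places subtasks of several tasks into shared slots. It is a sketch under stated assumptions, not the scheduler from the pull request: the class names are hypothetical, and the rule that a shared slot holds at most one subtask per task is an assumption of this sketch that the mail does not spell out.

```java
// Hypothetical toy model for illustration only; not the scheduler code
// from the pull request.
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SharedSlotSketch {
    // Names of the tasks that already run one subtask in this slot.
    private final Set<String> tasks = new HashSet<>();

    // Assumption of this sketch: at most one subtask per task in a slot.
    boolean tryAdd(String taskName) {
        return tasks.add(taskName);
    }
}

final class SlotSharingGroupSketch {
    private final List<SharedSlotSketch> slots = new ArrayList<>();

    // Places one subtask and returns the index of the slot it landed in.
    // An existing slot is reused when possible; otherwise a new one is taken.
    int place(String taskName) {
        for (int i = 0; i < slots.size(); i++) {
            if (slots.get(i).tryAdd(taskName)) {
                return i;
            }
        }
        SharedSlotSketch fresh = new SharedSlotSketch();
        fresh.tryAdd(taskName);
        slots.add(fresh);
        return slots.size() - 1;
    }

    int slotsUsed() {
        return slots.size();
    }
}

final class SlotSharingDemo {
    public static void main(String[] args) {
        SlotSharingGroupSketch group = new SlotSharingGroupSketch();
        // Three tasks, each with parallelism 4, in one sharing group.
        for (String task : new String[] {"source", "map", "sink"}) {
            for (int subtask = 0; subtask < 4; subtask++) {
                group.place(task);
            }
        }
        System.out.println("slots used: " + group.slotsUsed());
    }
}
```

In this model the twelve subtasks occupy four slots instead of twelve. A co-location constraint would be stricter than this: it would pin specific subtasks (for example, an iteration head and tail) to the same slot rather than letting the group pick any slot that fits.
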
>  Stephan
>
>
