flink-issues mailing list archives

From StephanEwen <...@git.apache.org>
Subject [GitHub] flink pull request #3295: [FLINK-5747] [distributed coordination] Eager sche...
Date Sun, 12 Feb 2017 23:24:15 GMT
GitHub user StephanEwen opened a pull request:


    [FLINK-5747] [distributed coordination] Eager scheduling allocates slots and deploys tasks in bulk

    ## Problem Addressed
    Currently, eager scheduling immediately triggers scheduling for all vertices and their subtasks in topological order.
    This has two problems:
      - It works only as long as resource acquisition is "synchronous". With dynamic resource acquisition in FLIP-6, resources are returned as Futures, which may complete out of order. Tasks are then scheduled out of order (not in topological order), which does not work for streaming.
      - Deploying tasks that depend on other tasks before it is clear that those other tasks have resources as well leads to situations where many deploy/recovery cycles happen before enough resources are available to get the job fully running.
    ## Implemented Change
      - The `Execution` has separate methods to allocate a resource and to deploy the task to that resource.
      - The **eager** scheduling mode allocates all resources in one chunk and then deploys once all resources are available.
    As a utility, this implements the `FutureUtils.combineAll` method, which combines the Futures of the individual resources into a single Future.
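
The allocate-then-deploy idea can be illustrated with a minimal sketch. This is not the code from the pull request: it uses `java.util.concurrent.CompletableFuture` as a stand-in for Flink's own Future type, and the class name `EagerSchedulingSketch`, the slot strings, and this particular `combineAll` signature are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration; not Flink's FutureUtils implementation.
final class EagerSchedulingSketch {

    /**
     * Combines the given futures into one future that completes once all
     * inputs have completed. Results are listed in input order, regardless
     * of the order in which the inputs complete.
     */
    static <T> CompletableFuture<List<T>> combineAll(Collection<CompletableFuture<T>> futures) {
        List<CompletableFuture<T>> ordered = new ArrayList<>(futures);
        return CompletableFuture
                .allOf(ordered.toArray(new CompletableFuture<?>[0]))
                .thenApply(ignored -> {
                    List<T> results = new ArrayList<>(ordered.size());
                    for (CompletableFuture<T> future : ordered) {
                        results.add(future.join()); // already complete, does not block
                    }
                    return results;
                });
    }

    public static void main(String[] args) {
        // One pending slot request per task, listed in topological order.
        CompletableFuture<String> source = new CompletableFuture<>();
        CompletableFuture<String> map = new CompletableFuture<>();
        CompletableFuture<String> sink = new CompletableFuture<>();

        // "Deployment" happens only after every slot is available.
        List<String> deployed = new ArrayList<>();
        combineAll(Arrays.asList(source, map, sink)).thenAccept(deployed::addAll);

        // Slots arrive out of order, as they may with dynamic acquisition.
        sink.complete("slot-for-sink");
        source.complete("slot-for-source");
        System.out.println("deployed before last slot: " + deployed);

        map.complete("slot-for-map");
        System.out.println("deployed after last slot: " + deployed);
    }
}
```

Because the combined future keeps the input order, deployment can proceed in topological order even when the individual slot futures complete in a different order. If any input fails, the combined future in this sketch completes exceptionally and nothing is deployed.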
    ## Tests
    The main tests are in `ExecutionGraphSchedulingTest`. The utilities used are tested in `FutureUtilsTest` and `ExecutionGraphUtilsTest`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/StephanEwen/incubator-flink slot_scheduling

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3295

commit 1f18cbb0d6d119fa5e5c4803201c28887b90cef5
Author: Stephan Ewen <sewen@apache.org>
Date:   2017-02-03T19:26:23Z

    [FLINK-5747] [distributed coordination] Eager scheduling allocates slots and deploys tasks in bulk
    That way, strictly topological deployment can be guaranteed.
    Also, many quick deploy/not-enough-resources/fail/recover cycles can be
    avoided in the cases where resources need some time to appear.


If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
