flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-8749) Release slots when scheduling operation is canceled in ExecutionGraph
Date Thu, 22 Feb 2018 16:18:03 GMT

    [ https://issues.apache.org/jira/browse/FLINK-8749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373008#comment-16373008 ]

ASF GitHub Bot commented on FLINK-8749:

GitHub user tillrohrmann opened a pull request:


    [FLINK-8749] [flip6] Release slots when scheduling operation is canceled

    ## What is the purpose of the change
    Release slots when the scheduling operation is canceled in the `ExecutionGraph`.
    ## Brief changelog
    - Added `SlotProvider#cancelSlotRequest`
    - Adapted `ConjunctFuture` to cancel the individual futures of the conjunction
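The second changelog item can be pictured with a minimal sketch built on the JDK's `CompletableFuture`. The class name `CancellingConjunction` is hypothetical and is not Flink's `ConjunctFuture`; it only illustrates the idea that canceling the combined future also cancels each future it was built from.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical stand-in for a conjunction of futures; not Flink's ConjunctFuture. */
final class CancellingConjunction extends CompletableFuture<Void> {

    private final List<CompletableFuture<?>> parts;

    CancellingConjunction(List<? extends CompletableFuture<?>> inputs) {
        this.parts = new ArrayList<>(inputs);
        // Complete this future once every part is done, forwarding any failure.
        CompletableFuture.allOf(parts.toArray(new CompletableFuture<?>[0]))
            .whenComplete((ignored, failure) -> {
                if (failure != null) {
                    completeExceptionally(failure);
                } else {
                    complete(null);
                }
            });
    }

    @Override
    public boolean cancel(boolean mayInterruptIfRunning) {
        boolean cancelled = super.cancel(mayInterruptIfRunning);
        if (cancelled) {
            // Propagate the cancellation to every individual future.
            for (CompletableFuture<?> part : parts) {
                part.cancel(mayInterruptIfRunning);
            }
        }
        return cancelled;
    }
}
```

With a plain `CompletableFuture.allOf(...)`, canceling the combined future leaves the inputs untouched, which is why the override is needed in this sketch.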
    ## Verifying this change
    Tested manually.
    ## Does this pull request potentially affect one of the following parts:
      - Dependencies (does it add or upgrade a dependency): (no)
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
      - The serializers: (no)
      - The runtime per-record code paths (performance sensitive): (no)
      - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
      - The S3 file system connector: (no)
    ## Documentation
      - Does this pull request introduce a new feature? (no)
      - If yes, how is the feature documented? (not applicable)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink hardenRescaling

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5562

commit 8cb296e2a5c9ddc2234e04791f7aab8eaec73b1b
Author: Till Rohrmann <trohrmann@...>
Date:   2018-02-21T14:57:50Z

    [FLINK-8732] [flip6] Cancel ongoing scheduling operation
    Keeps track of ongoing scheduling operations in the ExecutionGraph and cancels
    them in case of a concurrent cancel, suspend or fail call. This makes sure that
    the original cause for termination is maintained.
    This closes #5548.

commit b1dd80ccbe2c2a71758524d5d6f0ffa5fdd84a30
Author: Till Rohrmann <trohrmann@...>
Date:   2018-02-01T13:37:15Z

    [hotfix] Fix checkstyle violations in ExecutionGraph

commit f17c50bd5c36270928267e3ed6ca6fb2ffea0ccc
Author: Till Rohrmann <trohrmann@...>
Date:   2018-02-01T17:04:06Z

    [FLINK-8627] Introduce new JobStatus#SUSPENDING to ExecutionGraph
    The new JobStatus#SUSPENDING indicates that an ExecutionGraph has been suspended but
    its cleanup has not finished yet. Only after all Executions have been canceled does
    the ExecutionGraph enter the SUSPENDED state and complete the termination future.
    This closes #5445.
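The two-phase suspension described in this commit message can be sketched as a small state machine. Everything here (`SuspendSketch`, the `Status` enum, the execution counter) is a hypothetical, single-threaded illustration and not Flink's `ExecutionGraph`.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical, single-threaded model of the SUSPENDING -> SUSPENDED transition. */
final class SuspendSketch {

    enum Status { RUNNING, SUSPENDING, SUSPENDED }

    private Status status = Status.RUNNING;
    private int liveExecutions;
    private final CompletableFuture<Status> terminationFuture = new CompletableFuture<>();

    SuspendSketch(int liveExecutions) {
        this.liveExecutions = liveExecutions;
    }

    Status status() {
        return status;
    }

    CompletableFuture<Status> terminationFuture() {
        return terminationFuture;
    }

    /** Suspension was requested; clean-up may still be outstanding. */
    void suspend() {
        if (status == Status.RUNNING) {
            status = Status.SUSPENDING;
            finishIfCleanedUp();
        }
    }

    /** Called once per execution when its cancellation has finished. */
    void onExecutionCanceled() {
        if (liveExecutions > 0) {
            liveExecutions--;
        }
        finishIfCleanedUp();
    }

    private void finishIfCleanedUp() {
        // Only the last canceled execution moves the graph to its terminal state.
        if (status == Status.SUSPENDING && liveExecutions == 0) {
            status = Status.SUSPENDED;
            terminationFuture.complete(status);
        }
    }
}
```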

commit 0a6973ba32c0bd1a3e8a3f0af3ed2bac7e4917d9
Author: Till Rohrmann <trohrmann@...>
Date:   2018-02-13T15:14:41Z

    [FLINK-8629] [flip6] Allow JobMaster to rescale jobs
    This commit adds the functionality to rescale a job or parts of it to
    the JobMaster. In order to rescale a job, the JobMaster does the following:
    1. Take a savepoint
    2. Create a rescaled ExecutionGraph from the JobGraph
    3. Initialize it with the taken savepoint
    4. Suspend the old ExecutionGraph
    5. Restart the new ExecutionGraph once the old ExecutionGraph has been suspended
    This closes #5446.
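The five steps above can be sketched as a chain of futures. The `Graph` interface, the factory parameter, and the `LoggingGraph` test double are assumptions made for illustration; they do not reflect the actual `JobMaster` or `ExecutionGraph` signatures.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.IntFunction;

final class RescaleSketch {

    /** Hypothetical minimal view of an execution graph; not Flink's class. */
    interface Graph {
        CompletableFuture<String> takeSavepoint();
        void restoreFrom(String savepoint);
        CompletableFuture<Void> suspend();
        void start();
    }

    /** graphFactory stands in for "create a rescaled ExecutionGraph from the JobGraph". */
    static CompletableFuture<Graph> rescale(
            Graph oldGraph, IntFunction<Graph> graphFactory, int newParallelism) {
        return oldGraph.takeSavepoint().thenCompose(savepoint -> {   // step 1
            Graph newGraph = graphFactory.apply(newParallelism);     // step 2
            newGraph.restoreFrom(savepoint);                         // step 3
            return oldGraph.suspend().thenApply(ignored -> {         // step 4
                newGraph.start();                                    // step 5
                return newGraph;
            });
        });
    }

    /** Test double that appends every call to a shared log. */
    static final class LoggingGraph implements Graph {
        private final String name;
        private final List<String> log;

        LoggingGraph(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        public CompletableFuture<String> takeSavepoint() {
            log.add(name + ":savepoint");
            return CompletableFuture.completedFuture("sp-1");
        }

        public void restoreFrom(String savepoint) {
            log.add(name + ":restore(" + savepoint + ")");
        }

        public CompletableFuture<Void> suspend() {
            log.add(name + ":suspend");
            return CompletableFuture.completedFuture(null);
        }

        public void start() {
            log.add(name + ":start");
        }
    }
}
```

Chaining on the future returned by `suspend()` is what enforces step 5: the new graph is only started once the old one reports that it has been suspended.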

commit 9c29e815b960796c33511a14483848f52a2454c5
Author: Till Rohrmann <trohrmann@...>
Date:   2018-02-13T15:34:31Z

    [FLINK-8633] [flip6] Expose rescaling of jobs via the Dispatcher
    This commit exposes JobMaster#rescaleJob via the Dispatcher, so that a REST
    handler can call this functionality.
    This closes #5452.

commit b3e65c6914970bdce20b1fa572655403200ae2a1
Author: Till Rohrmann <trohrmann@...>
Date:   2018-02-02T10:06:35Z

    [FLINK-8634] [rest] Introduce job rescaling REST handler
    Add rescaling REST handler as a sub class of the
    This closes #5451.

commit 608a9204be0fcec8ba771ca3688586deadbadc5e
Author: Till Rohrmann <trohrmann@...>
Date:   2018-02-11T18:50:46Z

    [FLINK-8635] [rest] Register rescaling handlers at web endpoint
    This closes #5454.

commit cd27bf03a954c23c3879f81eadfb4af89f2e4a91
Author: Till Rohrmann <trohrmann@...>
Date:   2018-02-13T16:29:32Z

    [FLINK-8656] [flip6] Add modify CLI command to rescale Flink jobs
    Jobs can now be rescaled by calling flink modify <JOB_ID> -p <PARALLELISM>.
    Internally, the CliFrontend will send the corresponding REST call and poll
    for status updates.
    This closes #5487.

commit 16e88e61aa0172e9de59cfa3756f230c045777a4
Author: Till Rohrmann <trohrmann@...>
Date:   2018-02-22T11:36:53Z

    [FLINK-8746] [flip6] Allow rescaling of partially running jobs
    This commit enables the rescaling of Flink jobs that are not yet fully deployed.
    In such a case, Flink will use the last internal rescaling savepoint. If there is
    no such savepoint, it will use the savepoint provided when the job was submitted.
    If there is no savepoint at all, it will restart the job with vanilla state.
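The fallback order described in this commit message can be written as a small selection function. `RescalingRestoreChoice` and its parameter names are hypothetical, and savepoints are represented as plain path strings purely for illustration.

```java
import java.util.Optional;

final class RescalingRestoreChoice {

    /**
     * Picks the state to restore from when rescaling a partially running job.
     * An empty result means the job restarts without any restored state.
     */
    static Optional<String> choose(
            Optional<String> lastRescalingSavepoint, Optional<String> submittedSavepoint) {
        if (lastRescalingSavepoint.isPresent()) {
            return lastRescalingSavepoint; // preferred: last internal rescaling savepoint
        }
        return submittedSavepoint;         // otherwise: savepoint given at submission, if any
    }
}
```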

commit 0ac1b3dabb4e73d08d2198ab56b961201b1e87cf
Author: Till Rohrmann <trohrmann@...>
Date:   2018-02-22T13:12:48Z

    [hotfix] Register job status listener for rescaled job

commit 3a09100df0013eb0abec255efb4a4e09fccf1903
Author: Till Rohrmann <trohrmann@...>
Date:   2018-02-22T13:10:29Z

    [FLINK-8748] [flip6] Cancel slot allocations for alternatively completed slot requests
    If a slot request is fulfilled with a different AllocatedSlot in the SlotPool,
    then we cancel the slot request sent to the ResourceManager.
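A rough sketch of this behaviour, under the assumption that pending requests are keyed by a request id and that a slot arriving under an unknown id may be handed to the oldest pending request. `PendingSlotRequests` and the cancel callback are hypothetical names, not the `SlotPool` or `ResourceManager` API.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

/** Hypothetical model of pending slot requests; slots and ids are plain strings. */
final class PendingSlotRequests {

    private final Map<String, CompletableFuture<String>> pending = new LinkedHashMap<>();
    // Stand-in for telling the resource manager that a request is no longer needed.
    private final Consumer<String> cancelAtResourceManager;

    PendingSlotRequests(Consumer<String> cancelAtResourceManager) {
        this.cancelAtResourceManager = cancelAtResourceManager;
    }

    CompletableFuture<String> request(String requestId) {
        CompletableFuture<String> future = new CompletableFuture<>();
        pending.put(requestId, future);
        return future;
    }

    void offerSlot(String forRequestId, String slot) {
        CompletableFuture<String> exact = pending.remove(forRequestId);
        if (exact != null) {
            exact.complete(slot); // fulfilled by the slot that was requested for it
            return;
        }
        Iterator<Map.Entry<String, CompletableFuture<String>>> it =
            pending.entrySet().iterator();
        if (it.hasNext()) {
            Map.Entry<String, CompletableFuture<String>> oldest = it.next();
            String requestId = oldest.getKey();
            CompletableFuture<String> future = oldest.getValue();
            it.remove();
            // Fulfilled with a different slot: the outstanding request is now redundant.
            cancelAtResourceManager.accept(requestId);
            future.complete(slot);
        }
    }
}
```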

commit 28fe5008d3e2dc8b98d6dd2e947eec1ce3ee1941
Author: Till Rohrmann <trohrmann@...>
Date:   2018-02-22T14:37:28Z

    [hotfix] Avoid redundant slot release operations

commit 970fae405fb00f5e56481d72ee247cdedb5c4d57
Author: Till Rohrmann <trohrmann@...>
Date:   2018-02-22T14:37:37Z

    [hotfix] Cancel pending slot request when SlotPool is suspended

commit 9f6b9bc874e437ce417b438f0d1b579628383005
Author: Till Rohrmann <trohrmann@...>
Date:   2018-02-22T16:01:05Z

    [FLINK-8749] [flip6] Release slots when scheduling operation is canceled
    Release slots when the scheduling operation is canceled in the ExecutionGraph.
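One way to picture the change: when a scheduling operation is canceled, each slot future is either still pending, in which case its request is canceled, or already completed, in which case the slot is handed back. The `SlotSource` interface and its method names below are hypothetical and do not reproduce the `SlotProvider` signatures added by this pull request.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;

final class SchedulingCancellation {

    /** Hypothetical slot source; method names are illustrative, not Flink's API. */
    interface SlotSource {
        void cancelRequest(String requestId);
        void release(String slot);
    }

    /** Cancels pending slot requests and returns slots that were already obtained. */
    static void cancelScheduling(
            Map<String, CompletableFuture<String>> slotFutures, SlotSource slots) {
        for (Map.Entry<String, CompletableFuture<String>> entry : slotFutures.entrySet()) {
            CompletableFuture<String> future = entry.getValue();
            if (future.cancel(false)) {
                // Still pending (or already canceled): withdraw the request.
                slots.cancelRequest(entry.getKey());
            } else if (!future.isCompletedExceptionally()) {
                // Completed with a slot: give it back right away.
                slots.release(future.join());
            }
        }
    }
}
```

Futures that already failed hold no slot, so the sketch leaves them alone.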


> Release slots when scheduling operation is canceled in ExecutionGraph
> ---------------------------------------------------------------------
>                 Key: FLINK-8749
>                 URL: https://issues.apache.org/jira/browse/FLINK-8749
>             Project: Flink
>          Issue Type: Improvement
>          Components: Distributed Coordination
>    Affects Versions: 1.5.0
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>            Priority: Major
>              Labels: flip-6
>             Fix For: 1.5.0
> In order to quickly release slots, we should explicitly return them to the {{SlotProvider}}
> if the scheduling operation is cancelled in the {{ExecutionGraph}}.

This message was sent by Atlassian JIRA
