flink-issues mailing list archives

From "Till Rohrmann (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-9845) Make InternalTimerService's timer processing interruptible/abortable
Date Wed, 10 Oct 2018 07:43:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-9845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16644582#comment-16644582 ]

Till Rohrmann commented on FLINK-9845:

Hi [~SleePy], great to hear that you want to help with this issue :-) Feel free to take it.

> Make InternalTimerService's timer processing interruptible/abortable
> --------------------------------------------------------------------
>                 Key: FLINK-9845
>                 URL: https://issues.apache.org/jira/browse/FLINK-9845
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.5.1, 1.6.0
>            Reporter: Till Rohrmann
>            Priority: Major
>             Fix For: 1.7.0
>
> When cancelling a {{Task}}, the task thread might currently be processing the timers
> registered at the {{InternalTimerService}}. Depending on the timer action, this might take
> a while and thus block the cancellation of the {{Task}}. In the most extreme case, the
> {{TaskCancelerWatchDog}} kicks in and kills the whole {{TaskManager}} process.
>
> In order to alleviate the problem (speed up the cancellation reaction), we should make
> the processing of the timers interruptible/abortable. This means that instead of processing
> all timers, we should check in between timers whether the {{Task}} is currently being
> cancelled. If this is the case, we should immediately stop processing the remaining
> timers and return.
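
The proposal in the issue description could be sketched roughly as follows. This is only an illustration and not Flink's actual implementation: the class {{AbortableTimerQueue}}, its methods, and the {{isCancelled}} supplier are hypothetical names made up for this example, and a plain {{PriorityQueue}} of timestamps stands in for the real timer queue.

```java
import java.util.PriorityQueue;
import java.util.function.BooleanSupplier;
import java.util.function.LongConsumer;

/** Hypothetical stand-in for a timer service; NOT Flink's InternalTimerService. */
class AbortableTimerQueue {
    // Timers ordered by timestamp, earliest first.
    private final PriorityQueue<Long> timers = new PriorityQueue<>();
    // Assumption: the owning task exposes its cancellation state through this supplier.
    private final BooleanSupplier isCancelled;

    AbortableTimerQueue(BooleanSupplier isCancelled) {
        this.isCancelled = isCancelled;
    }

    void register(long timestamp) {
        timers.add(timestamp);
    }

    int pending() {
        return timers.size();
    }

    /**
     * Fires every timer due at or before {@code time}, but checks the
     * cancellation flag before each timer and stops early if it is set.
     * Returns the number of timers that were actually fired.
     */
    int advanceTo(long time, LongConsumer onTimer) {
        int fired = 0;
        while (!timers.isEmpty() && timers.peek() <= time) {
            if (isCancelled.getAsBoolean()) {
                break; // leave the remaining timers in the queue
            }
            long timestamp = timers.poll();
            onTimer.accept(timestamp);
            fired++;
        }
        return fired;
    }
}
```

Note that in this sketch the check happens only between timers, so a timer action that is already running is not interrupted; cancellation latency is bounded by the longest single timer action rather than by the sum of all pending ones.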

This message was sent by Atlassian JIRA