flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-4933) ExecutionGraph.scheduleOrUpdateConsumers can fail the ExecutionGraph
Date Fri, 28 Oct 2016 13:56:58 GMT

    [ https://issues.apache.org/jira/browse/FLINK-4933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15615486#comment-15615486 ]

ASF GitHub Bot commented on FLINK-4933:
---------------------------------------

Github user uce commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2700#discussion_r85533357
  
    --- Diff: flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala ---
    @@ -917,8 +917,15 @@ class JobManager(
         case ScheduleOrUpdateConsumers(jobId, partitionId) =>
           currentJobs.get(jobId) match {
             case Some((executionGraph, _)) =>
    -          sender ! decorateMessage(Acknowledge)
    -          executionGraph.scheduleOrUpdateConsumers(partitionId)
    +          try {
    +            executionGraph.scheduleOrUpdateConsumers(partitionId)
    +            sender ! decorateMessage(Acknowledge)
    +          } catch {
    +            case e: ExecutionGraphException =>
    --- End diff --
    
    Does it make sense to catch the more generic `Exception` type here so that the sender
    notices any problem sooner? I see that the method currently throws only
    `ExecutionGraphException`, but at some point someone may introduce a runtime exception.
    That exception would only be logged at the JobManager, and the task's ask would time out.
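
    The suggestion could look roughly like the sketch below. It is not the code from the pull request: `Reply`, `FailureReply`, and `handle` are hypothetical stand-ins for the actor message types and the `sender !` reply, and the exception class is redefined locally so the block compiles without Flink or Akka on the classpath. It only illustrates that catching `Exception` turns every failure into a reply instead of leaving the caller to wait for a timeout.

```scala
// Hypothetical stand-ins for the real reply messages; only the control flow matters.
sealed trait Reply
case object Acknowledge extends Reply
final case class FailureReply(cause: Throwable) extends Reply

// Local stand-in for Flink's ExecutionGraphException.
final class ExecutionGraphException(msg: String) extends Exception(msg)

object ScheduleOrUpdateSketch {
  // Runs the update and always returns a reply for the sender.
  // Catching Exception (not just ExecutionGraphException) means a
  // runtime exception introduced later is also reported to the caller.
  def handle(update: () => Unit): Reply =
    try {
      update()
      Acknowledge
    } catch {
      case e: Exception => FailureReply(e)
    }
}
```

    With this shape, `ScheduleOrUpdateSketch.handle(() => throw new IllegalStateException("boom"))` yields a `FailureReply` rather than propagating, which is the behaviour the comment asks about.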


> ExecutionGraph.scheduleOrUpdateConsumers can fail the ExecutionGraph
> --------------------------------------------------------------------
>
>                 Key: FLINK-4933
>                 URL: https://issues.apache.org/jira/browse/FLINK-4933
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination
>    Affects Versions: 1.2.0, 1.1.3
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>             Fix For: 1.2.0, 1.1.4
>
>
> Currently, {{ExecutionGraph.scheduleOrUpdateConsumers}} can fail the whole {{ExecutionGraph}}
> if it cannot find the corresponding {{Execution}}. This can happen during a restart, when a
> late callback tries to update its consumers. In that case, the call should forward the
> exception back to the caller and not fail the {{ExecutionGraph}}.
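
The intended behaviour can be illustrated with a toy model. This is a sketch under stated assumptions, not Flink's implementation: `GraphSketch`, `SketchGraphException`, `register`, and `restart` are hypothetical names, and the graph is reduced to a map from partition id to execution name. The point is that a lookup miss after a restart raises an exception for the caller while the graph's own state stays untouched.

```scala
import scala.collection.mutable

// Hypothetical stand-in for Flink's ExecutionGraphException.
final class SketchGraphException(msg: String) extends Exception(msg)

// Toy graph: maps a partition id to the name of the execution producing it.
final class GraphSketch {
  private val executions = mutable.Map.empty[String, String]

  // Would be set by a global failure; the lookup below never touches it.
  var failed: Boolean = false

  def register(partitionId: String, execution: String): Unit =
    executions.update(partitionId, execution)

  // Simulates a restart that discards the old executions.
  def restart(): Unit = executions.clear()

  // Throws for an unknown partition instead of failing the whole graph.
  def scheduleOrUpdateConsumers(partitionId: String): String =
    executions.getOrElse(
      partitionId,
      throw new SketchGraphException(s"No execution found for partition $partitionId"))
}
```

A late callback that arrives after `restart()` then gets a `SketchGraphException` back, and `failed` remains `false`.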



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
