flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1376) SubSlots are not properly released in case that a TaskManager fatally fails, leaving the system in a corrupted state
Date Wed, 04 Feb 2015 10:30:34 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14304900#comment-14304900 ]

ASF GitHub Bot commented on FLINK-1376:
---------------------------------------

Github user tillrohrmann commented on the pull request:

    https://github.com/apache/flink/pull/317#issuecomment-72832512
  
    +1 for removing unnecessary ExecutionGraph information before archiving. 


> SubSlots are not properly released in case that a TaskManager fatally fails, leaving the system in a corrupted state
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-1376
>                 URL: https://issues.apache.org/jira/browse/FLINK-1376
>             Project: Flink
>          Issue Type: Bug
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>
> If a TaskManager fails fatally and some of the failed node's slots are SharedSlots, the JobManager does not release those slots properly. As a result, the corresponding job is not properly failed, leaving the system in a corrupted state.
> The reason is that the AllocatedSlot is not aware that it is being treated as a SharedSlot, and thus it cannot release the associated SubSlots.
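
To illustrate the failure mode described above, here is a minimal Java sketch. It is not Flink's actual code: the names `SlotReleaseSketch`, `SharedSlot`, `SubSlot` and all of their methods are hypothetical stand-ins. The only assumption is that a slot which knows it is shared can track the sub-slots handed out from it and release them together with itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; these are NOT Flink's real classes or method names.
public class SlotReleaseSketch {

    /** A sub-slot handed out from a shared slot. */
    static final class SubSlot {
        private boolean released = false;

        void release() {
            released = true;
        }

        boolean isReleased() {
            return released;
        }
    }

    /** A slot that is aware of being shared and tracks its sub-slots. */
    static final class SharedSlot {
        private final List<SubSlot> subSlots = new ArrayList<>();
        private boolean released = false;

        SubSlot allocateSubSlot() {
            if (released) {
                throw new IllegalStateException("slot already released");
            }
            SubSlot subSlot = new SubSlot();
            subSlots.add(subSlot);
            return subSlot;
        }

        /** Releases this slot and every sub-slot that was allocated from it. */
        void release() {
            if (released) {
                return; // releasing twice is a no-op
            }
            released = true;
            for (SubSlot subSlot : subSlots) {
                subSlot.release();
            }
            subSlots.clear();
        }

        boolean isReleased() {
            return released;
        }
    }

    public static void main(String[] args) {
        SharedSlot shared = new SharedSlot();
        SubSlot a = shared.allocateSubSlot();
        SubSlot b = shared.allocateSubSlot();

        // Simulate the owner reacting to a fatal TaskManager failure.
        shared.release();

        System.out.println(a.isReleased() && b.isReleased()); // prints "true"
    }
}
```

If the slot did not keep the `subSlots` list, `release()` would mark only the slot itself as released and the sub-slots would stay allocated, which is the kind of inconsistency the issue describes.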



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
