flink-issues mailing list archives

From "Robert Metzger (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (FLINK-2079) Add watcher to YARN TM containers to detect stopped actor system
Date Wed, 27 May 2015 07:59:17 GMT

     [ https://issues.apache.org/jira/browse/FLINK-2079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Metzger resolved FLINK-2079.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: 0.9

Resolved in http://git-wip-us.apache.org/repos/asf/flink/commit/11b021b0

> Add watcher to YARN TM containers to detect stopped actor system
> ----------------------------------------------------------------
>
>                 Key: FLINK-2079
>                 URL: https://issues.apache.org/jira/browse/FLINK-2079
>             Project: Flink
>          Issue Type: Improvement
>          Components: TaskManager, YARN Client
>    Affects Versions: 0.9
>            Reporter: Robert Metzger
>            Assignee: Robert Metzger
>             Fix For: 0.9
>
>
> I experienced an OutOfMemoryError (caused by user code) while running Flink on YARN.
> The TaskManager appears to detect the fatal error correctly, but the JVM does not shut
> down, so YARN does not bring up new containers.
> Therefore, I want to start a thread in the YarnTaskManagerRunner that periodically (every
> 30 seconds) checks whether the actor system is still running. If it is not, the thread
> calls System.exit(1).
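
The watcher described in the issue can be sketched roughly as follows. This is
an illustration only, not the code from the commit referenced above: the class
name ActorSystemWatchdog, the method start, and its parameters are hypothetical.
The liveness check and the exit action are passed in as functions, so the sketch
does not depend on any particular actor system API.

```java
import java.util.function.BooleanSupplier;
import java.util.function.IntConsumer;

/** Hypothetical watchdog sketch; names are assumptions, not Flink's actual classes. */
final class ActorSystemWatchdog {

    private ActorSystemWatchdog() {}

    /**
     * Starts a daemon thread that polls isRunning, sleeping intervalMillis
     * between polls. The first time the check returns false, onStopped is
     * called with exit code 1 and the thread ends.
     */
    static Thread start(BooleanSupplier isRunning, long intervalMillis, IntConsumer onStopped) {
        Thread watcher = new Thread(() -> {
            try {
                while (isRunning.getAsBoolean()) {
                    Thread.sleep(intervalMillis);
                }
                // The actor system is gone: hand exit code 1 to the caller's action.
                onStopped.accept(1);
            } catch (InterruptedException e) {
                // Interrupted (for example during an orderly shutdown):
                // stop watching without triggering the exit action.
                Thread.currentThread().interrupt();
            }
        }, "actor-system-watchdog");
        // A daemon thread never keeps the JVM alive on its own.
        watcher.setDaemon(true);
        watcher.start();
        return watcher;
    }
}
```

In a runner process, this would be started once after the actor system is up,
with a 30000 ms interval, a supplier that reports whether the actor system is
still alive, and System::exit as the exit action, so that the JVM terminates
with status 1 and YARN sees the container as finished.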



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)