flink-issues mailing list archives

From "Maximilian Michels (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (FLINK-3300) Concurrency Bug in Yarn JobManager
Date Mon, 01 Feb 2016 09:12:40 GMT

     [ https://issues.apache.org/jira/browse/FLINK-3300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Maximilian Michels resolved FLINK-3300.
    Resolution: Fixed

Fixed in 2a49eaaf3c949864457aee0ffd99343a50ac7285.

> Concurrency Bug in Yarn JobManager
> ----------------------------------
>                 Key: FLINK-3300
>                 URL: https://issues.apache.org/jira/browse/FLINK-3300
>             Project: Flink
>          Issue Type: Bug
>          Components: JobManager
>    Affects Versions: 1.0.0
>            Reporter: Stephan Ewen
>            Assignee: Maximilian Michels
>            Priority: Blocker
>             Fix For: 1.0.0
> The change to use the async ResourceManager client introduced concurrency problems: the
> ResourceManager callback threads run and change data structures at the same time as the actor
> methods, voiding the actor concurrency model.
> One example that can happen is that the callback tries to start containers while the
> ContainerLaunchContext is still not set (because the actor method is still in the StartYarnSession).
> Bug introducing commit: https://github.com/apache/flink/commit/4e52fe4304566e5239996b3d48290e0c1f0772e8
> A quick fix could be to revert the commit. A better solution would be to let the callback
> methods send actor messages to the JobManager, rather than acting directly.
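
The "better solution" in the description above (callbacks send messages instead of acting directly) can be illustrated with a minimal, self-contained sketch. It is not the code from the fix commit and does not use Akka or the YARN client. Every name in it (CallbackMailboxSketch, StartSession, ContainersAllocated, Stop, tell, onContainersAllocated, processMailbox) is hypothetical. A plain blocking queue stands in for the actor mailbox, and a boolean flag stands in for the ContainerLaunchContext being set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for the actor; none of these names are Flink, Akka, or YARN APIs.
final class CallbackMailboxSketch {
    interface Message {}
    static final class StartSession implements Message {}
    static final class ContainersAllocated implements Message {
        final int count;
        ContainersAllocated(int count) { this.count = count; }
    }
    static final class Stop implements Message {}

    private final BlockingQueue<Message> mailbox = new LinkedBlockingQueue<>();

    // The state below is read and written only by the thread running processMailbox().
    private boolean launchContextSet = false;
    private int deferredContainers = 0;
    private final List<String> events = new ArrayList<>();

    // Safe to call from any thread: enqueues a message and touches no actor state.
    void tell(Message message) { mailbox.add(message); }

    // What a callback thread would call instead of launching containers itself.
    void onContainersAllocated(int count) { tell(new ContainersAllocated(count)); }

    // Handles messages one at a time until Stop arrives, then returns the event log.
    List<String> processMailbox() {
        try {
            while (true) {
                Message m = mailbox.take();
                if (m instanceof Stop) {
                    return events;
                } else if (m instanceof StartSession) {
                    launchContextSet = true;
                    events.add("launch context set");
                    if (deferredContainers > 0) {
                        events.add("launched " + deferredContainers + " deferred");
                        deferredContainers = 0;
                    }
                } else if (m instanceof ContainersAllocated) {
                    int count = ((ContainersAllocated) m).count;
                    if (launchContextSet) {
                        events.add("launched " + count);
                    } else {
                        // Too early to launch: remember the containers for later.
                        deferredContainers += count;
                        events.add("deferred " + count);
                    }
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        CallbackMailboxSketch actor = new CallbackMailboxSketch();
        // A separate thread plays the role of the callback firing before the session starts.
        Thread callback = new Thread(() -> actor.onContainersAllocated(2));
        callback.start();
        callback.join();
        actor.tell(new StartSession());
        actor.onContainersAllocated(1);
        actor.tell(new Stop());
        System.out.println(actor.processMailbox());
    }
}
```

In this sketch the callback thread only enqueues, so an allocation that arrives before the launch context exists is deferred by the single message-processing thread rather than racing with it.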

This message was sent by Atlassian JIRA
