flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-4152) TaskManager registration exponential backoff doesn't work
Date Tue, 19 Jul 2016 11:48:20 GMT

    [ https://issues.apache.org/jira/browse/FLINK-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384005#comment-15384005 ]

ASF GitHub Bot commented on FLINK-4152:

Github user tillrohrmann commented on a diff in the pull request:

    --- Diff: flink-yarn/src/main/java/org/apache/flink/yarn/YarnFlinkResourceManager.java
    @@ -78,6 +79,9 @@
     	/** The containers where a TaskManager is starting and we are waiting for it to register
     	private final Map<ResourceID, YarnContainerInLaunch> containersInLaunch;
    +	/** The container where a TaskManager has been started and is running in */
    +	private final Map<ResourceID, Container> containersLaunched;
    --- End diff --
It is true that it holds the registered resources, but it does not hold the launched containers.
When a `JobManager` loses its leadership, the list of registered workers is cleared. To
reconstruct the mapping `ResourceID --> Container` afterwards, you need this new map.
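
A minimal sketch of the point above, not the actual `YarnFlinkResourceManager` code: the class and method names are hypothetical, and plain `String` values stand in for `ResourceID` and `Container`. It only illustrates why a map that survives a leadership change is needed when the registered-worker list gets cleared.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for the resource manager's bookkeeping. */
class LaunchedContainerTracker {

    /** Assumed analogue of containersLaunched: resource id -> container. */
    private final Map<String, String> containersLaunched = new HashMap<>();

    /** Assumed analogue of the registered-worker list, cleared on leadership loss. */
    private final Set<String> registeredWorkers = new HashSet<>();

    /** Records that a container for the given resource id is up and running. */
    void containerLaunched(String resourceId, String container) {
        containersLaunched.put(resourceId, container);
    }

    /** Records that the TaskManager with this resource id has registered. */
    void workerRegistered(String resourceId) {
        registeredWorkers.add(resourceId);
    }

    /** Only the registered workers are dropped; the launched containers stay known. */
    void jobManagerLostLeadership() {
        registeredWorkers.clear();
    }

    boolean isRegistered(String resourceId) {
        return registeredWorkers.contains(resourceId);
    }

    /** Returns the container for a resource id, or null if none was launched. */
    String lookupContainer(String resourceId) {
        return containersLaunched.get(resourceId);
    }
}
```

In this sketch, a TaskManager that re-registers after a leadership change can still be matched to its container through `lookupContainer`, even though `isRegistered` returned false in between.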

> TaskManager registration exponential backoff doesn't work
> ---------------------------------------------------------
>                 Key: FLINK-4152
>                 URL: https://issues.apache.org/jira/browse/FLINK-4152
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination, TaskManager, YARN Client
>            Reporter: Robert Metzger
>            Assignee: Till Rohrmann
>         Attachments: logs.tgz
> While testing Flink 1.1 I've found that the TaskManagers are logging many messages when
> registering at the JobManager.
> This is the log file: https://gist.github.com/rmetzger/0cebe0419cdef4507b1e8a42e33ef294
> It's logging more than 3000 messages in less than a minute. I don't think that this is
> the expected behavior.

This message was sent by Atlassian JIRA
