flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-4152) TaskManager registration exponential backoff doesn't work
Date Mon, 18 Jul 2016 10:12:20 GMT

    [ https://issues.apache.org/jira/browse/FLINK-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382036#comment-15382036 ]

ASF GitHub Bot commented on FLINK-4152:

Github user mxm commented on the issue:

    Thank you for the pull request! Looking at the changes, it could have been split into two pull requests and JIRA issues: 1) avoiding duplicate RegisterTaskManager messages, and 2) changing core behavior of the ResourceManager.
    Concerning 2), I would like to understand why it was necessary to change so much code. It seems it would have sufficed to change one line of code (not clearing the bookkeeping on leadership change). I'm not saying your changes don't make sense, but I don't think they are backed by the original JIRA issue.
    I'm not sure about the role change of the RM in this PR. The RM should be the authority for allocating new resources. If those resources are not properly reported back to the RM (e.g. due to message loss), resource allocation won't work properly.

> TaskManager registration exponential backoff doesn't work
> ---------------------------------------------------------
>                 Key: FLINK-4152
>                 URL: https://issues.apache.org/jira/browse/FLINK-4152
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination, TaskManager, YARN Client
>            Reporter: Robert Metzger
>            Assignee: Till Rohrmann
>         Attachments: logs.tgz
> While testing Flink 1.1, I found that the TaskManagers log many messages when registering at the JobManager.
> This is the log file: https://gist.github.com/rmetzger/0cebe0419cdef4507b1e8a42e33ef294
> It's logging more than 3000 messages in less than a minute. I don't think that this is the expected behavior.
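
For reference, here is a minimal sketch of what a capped exponential backoff schedule for registration attempts looks like. It is not Flink's actual TaskManager code: the class name RegistrationBackoff, the method delaysMillis, and the 500 ms / 30 s values are assumptions chosen only for illustration. With a schedule like this, a TaskManager would make fewer than ten attempts in the first minute, not thousands.

```java
import java.util.Arrays;

// Hypothetical illustration only; not taken from the Flink code base.
public class RegistrationBackoff {

    /**
     * Returns the pause before each successive registration attempt.
     * The pause starts at initialMillis, doubles after every attempt,
     * and is never larger than maxMillis.
     */
    static long[] delaysMillis(long initialMillis, long maxMillis, int attempts) {
        long[] delays = new long[attempts];
        long current = Math.min(initialMillis, maxMillis);
        for (int i = 0; i < attempts; i++) {
            delays[i] = current;
            // Double the pause, but cap it at the configured maximum.
            current = Math.min(current * 2, maxMillis);
        }
        return delays;
    }

    public static void main(String[] args) {
        // Assumed values: 500 ms initial pause, 30 s maximum pause.
        System.out.println(Arrays.toString(delaysMillis(500, 30_000, 8)));
        // prints [500, 1000, 2000, 4000, 8000, 16000, 30000, 30000]
    }
}
```

The pause is stored before it is doubled, so the first retry waits exactly the initial pause; once the cap is reached, every later attempt waits the maximum.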

This message was sent by Atlassian JIRA
