flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-5179) MetricRegistry life-cycle issues with HA
Date Mon, 05 Dec 2016 10:39:58 GMT

    [ https://issues.apache.org/jira/browse/FLINK-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15721904#comment-15721904 ]

ASF GitHub Bot commented on FLINK-5179:

Github user uce commented on the issue:

    I tested this, but could not reproduce the reported erroneous behaviour. After the task
manager reconnects to a new job manager, the metrics are still being reported. I tested this
with JMX and Log4j reporters. The changes look reasonable to me though. @rmetzger @zentol
Should we merge this for 1.1.4?

> MetricRegistry life-cycle issues with HA
> ----------------------------------------
>                 Key: FLINK-5179
>                 URL: https://issues.apache.org/jira/browse/FLINK-5179
>             Project: Flink
>          Issue Type: Bug
>          Components: Metrics
>    Affects Versions: 1.1.3
>            Reporter: Chesnay Schepler
>            Assignee: Chesnay Schepler
>            Priority: Blocker
>             Fix For: 1.2.0, 1.1.4
> The TaskManager's MetricRegistry is started when the TaskManager is created, and shut down
> in the TaskManager's postStop method.
> However, the registry is also shut down within the TaskManager's disassociateFromJobManager
> method, and it is not restarted when the connection is re-established.
> Effectively, this means that a TaskManager that has ever reconnected to a JobManager will
> not report any metrics, since the reporters are shut down as well. Metrics will also no
> longer be sent to the WebInterface.
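
The life-cycle mismatch described in the report can be sketched roughly as follows. This is a minimal, self-contained illustration and not Flink's actual code: `SketchRegistry`, `SketchTaskManager`, `associateWithJobManager`, and the `shutdownRegistryOnDisassociate` flag are hypothetical names introduced only for this example. Only the method names `disassociateFromJobManager` and `postStop` come from the issue description. The flag is there to contrast the two life-cycles; it is not meant to represent the proposed patch.

```java
import java.util.ArrayList;
import java.util.List;

public class RegistryLifecycleSketch {

    /** Hypothetical stand-in for a metric registry; NOT Flink's MetricRegistry. */
    static class SketchRegistry {
        private boolean running = true;
        private final List<String> reported = new ArrayList<>();

        /** Records the metric only while the registry is running. */
        void report(String metric) {
            if (running) {
                reported.add(metric);
            }
        }

        /** Stops the registry; this sketch has no way to restart it. */
        void shutdown() {
            running = false;
        }

        boolean isRunning() {
            return running;
        }

        List<String> reported() {
            return reported;
        }
    }

    /** Hypothetical stand-in for the TaskManager's registry handling. */
    static class SketchTaskManager {
        // The registry is started once, when the TaskManager is created.
        final SketchRegistry registry = new SketchRegistry();
        // Assumption for illustration: toggles the behaviour described in the report.
        private final boolean shutdownRegistryOnDisassociate;

        SketchTaskManager(boolean shutdownRegistryOnDisassociate) {
            this.shutdownRegistryOnDisassociate = shutdownRegistryOnDisassociate;
        }

        void disassociateFromJobManager() {
            if (shutdownRegistryOnDisassociate) {
                registry.shutdown(); // reported problem: registry stops here ...
            }
        }

        void associateWithJobManager() {
            // ... and nothing restarts it when the connection is re-established.
        }

        void postStop() {
            registry.shutdown(); // the shutdown that matches the start at creation
        }
    }

    public static void main(String[] args) {
        SketchTaskManager reported = new SketchTaskManager(true);
        reported.disassociateFromJobManager();
        reported.associateWithJobManager();
        reported.registry.report("numRecordsIn");
        System.out.println("shutdown on disassociate: "
                + reported.registry.reported().size() + " metric(s) reported");

        SketchTaskManager symmetric = new SketchTaskManager(false);
        symmetric.disassociateFromJobManager();
        symmetric.associateWithJobManager();
        symmetric.registry.report("numRecordsIn");
        System.out.println("shutdown only in postStop: "
                + symmetric.registry.reported().size() + " metric(s) reported");
    }
}
```

In the sketch, tying the registry's shutdown only to `postStop` keeps its life-cycle symmetric with its start at creation, so a reconnect leaves reporting intact. Whether the real TaskManager shows the reported behaviour is a separate question, and the comment above notes that it could not be reproduced with the JMX and Log4j reporters.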

