nifi-commits mailing list archives

From "Matt Gilman (JIRA)" <>
Subject [jira] [Updated] (NIFI-1457) excessive bulletins and logging when primary node is revoked by NCM.
Date Mon, 01 Feb 2016 16:23:39 GMT


Matt Gilman updated NIFI-1457:
    Fix Version/s: 0.5.0

> excessive bulletins and logging when primary node is revoked by NCM.
> --------------------------------------------------------------------
>                 Key: NIFI-1457
>                 URL:
>             Project: Apache NiFi
>          Issue Type: Bug
>    Affects Versions: 0.4.1
>         Environment: centOS 7
>            Reporter: Matthew Clarke
>            Assignee: Matt Gilman
>            Priority: Minor
>             Fix For: 0.5.0
> I have a 3-node cluster up and running. The current primary node loses connectivity to
the NCM and is eventually disconnected by the NCM because of missing heartbeats. From the NCM, the
disconnected node is dropped from the cluster and a new primary node is manually elected. When
network comms are restored between the original primary node and the NCM, the NCM once again
receives heartbeat messages that claim to be from the primary node. The NCM correctly detects this
and revokes that node's primary role. The problem is that the bulletins stating that
the role has been revoked never stop being produced, because the original node's heartbeats still
claim to be from the primary node.
> On the original primary node I see these messages constantly:
> 2016-02-01 16:05:39,635 INFO [Process NCM Request-2] o.a.n.c.p.impl.SocketProtocolListener Received request c5f13d5b-0f09-4fe2-885e-d2d597339491
> On the NCM I see these messages constantly in the app log:
> 2016-02-01 11:08:25,636 INFO [Process Pending Heartbeats] o.a.n.c.manager.impl.WebClusterManager Node Event: [id=b996c3c0-996c-4072-ba5a-434294d72036,, apiPort=8443,, socketPort=10000] -- 'Heartbeat indicates node is running as primary node.  Revoking primary role because primary role is assigned to a different node.'
> The original primary node is no longer running any "on primary node" processors; however,
it appears the heartbeat message is not being updated to reflect that it is no longer the
primary node.
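
The expected behavior described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names (HeartbeatSketch, Node, processHeartbeat, onPrimaryRevoked) are hypothetical and are not NiFi's actual classes. The point is that once the revocation is applied, the node's next heartbeat should stop claiming the primary role, so the manager has nothing further to revoke or report.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch only: none of these names come from the NiFi code base.
public class HeartbeatSketch {

    /** Minimal stand-in for a cluster node that reports its role in each heartbeat. */
    static class Node {
        private final AtomicBoolean primary;

        Node(boolean initiallyPrimary) {
            this.primary = new AtomicBoolean(initiallyPrimary);
        }

        /** The role claim that the next heartbeat would carry. */
        boolean heartbeatClaimsPrimary() {
            return primary.get();
        }

        /** Applied when the manager revokes the primary role. */
        void onPrimaryRevoked() {
            primary.set(false);
        }
    }

    /**
     * Minimal stand-in for the manager's heartbeat check.
     * Returns true if a revocation (and hence a bulletin) was issued.
     */
    static boolean processHeartbeat(Node sender, Node assignedPrimary) {
        if (sender.heartbeatClaimsPrimary() && sender != assignedPrimary) {
            sender.onPrimaryRevoked();
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Node oldPrimary = new Node(true);   // reconnects still believing it is primary
        Node newPrimary = new Node(true);   // manually elected replacement

        System.out.println("first heartbeat revoked: " + processHeartbeat(oldPrimary, newPrimary));
        System.out.println("second heartbeat revoked: " + processHeartbeat(oldPrimary, newPrimary));
    }
}
```

In this sketch the revocation is issued once, on the first stale heartbeat. The reported symptom corresponds to the case where the flag backing the heartbeat claim is never cleared, so every later heartbeat triggers another revocation and another bulletin.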

This message was sent by Atlassian JIRA
