nifi-dev mailing list archives

From Anil Rai <anilrain...@gmail.com>
Subject Re: Nifi Docker cluster issue
Date Wed, 23 May 2018 17:31:46 GMT
Team, I did not hear back on this.
Here are some additional scenarios.
We have a 1-node cluster with a flow. When we run that flow, we get the
error below. After that, we see the following message on the canvas: "Action
cannot be performed because there is currently no Cluster Coordinator
elected. The request should be tried again after a moment, after a Cluster
Coordinator has been automatically elected."

Error in the log: 2018-05-22 11:03:46,272 ERROR [Timer-Driven Process
Thread-1] o.a.n.p.standard.HandleHttpResponse
HandleHttpResponse[id=11c43ac0-b502-11f7-82ec-caedaada5452] Failed to
respond to HTTP request for
StandardFlowFileRecord[uuid=be5f42c4-496b-4086-ad7d-bdedb8c87313,claim=StandardContentClaim
[resourceClaim=StandardResourceClaim[id=1526985786635-1032,
container=default, section=8], offset=505044,
length=8],offset=0,name=9546340083575618,size=8] because FlowFile had an
'http.context.identifier' attribute of f2df150e-5c72-4560-a78d-15d8e20f316f
but could not find an HTTP Response Object for this identifier
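
When many of these errors appear, it can help to pull the identifiers out of each entry so they can be matched against the request that created them. The following is a minimal sketch, not NiFi tooling: the function name and the regular expression are assumptions derived only from the log entry above, and the pattern would need adjusting if the log layout differs.

```python
import re

# 8-4-4-4-12 hex identifier, as seen in the log entry above.
_UUID = r"[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}"

# Assumed layout, based solely on the single HandleHttpResponse error quoted above.
_PATTERN = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (?P<level>[A-Z]+) "
    r".*?HandleHttpResponse\[id=(?P<processor_id>" + _UUID + r")\]"
    r".*?StandardFlowFileRecord\[uuid=(?P<flowfile_uuid>" + _UUID + r")"
    r".*?'http\.context\.identifier' attribute of (?P<context_id>" + _UUID + r")"
)


def parse_missing_response_error(text):
    """Return the identifiers from one error entry, or None if it does not match."""
    # Mail clients and terminals wrap long log lines, so collapse all whitespace first.
    flat = " ".join(text.split())
    match = _PATTERN.search(flat)
    return match.groupdict() if match else None
```

Grouping the results by `context_id` shows whether the same HTTP context is being answered more than once, and grouping by `processor_id` shows which HandleHttpResponse instance is involved.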

For the scenario below, the problem was that we had imported a template that
referenced another NiFi Registry (I know this was fixed recently).
We have a 3-node NiFi Docker cluster running on a VM. Frequently, one
of the nodes in the cluster stops sending heartbeats. We do not see any
errors in the logs for that container, and when we check the status of
NiFi, it shows it is running. Because the heartbeats stop, the node is
disconnected from the cluster. The canvas is not usable at this point until
we go to the cluster menu and explicitly remove the offending node.
If we then restart the NiFi instance, it comes up and rejoins the cluster.
We are not sure whether this is a Docker issue or a NiFi 1.5 issue. On top of
that, we do not see any error when the heartbeats stop from one of the nodes.
Any suggestions on how to troubleshoot this further?
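
One way to capture when a node drops out, since the logs show nothing, would be to poll the cluster summary from a node that is still connected and record any node whose status is not CONNECTED. The following is a rough sketch under stated assumptions: it assumes an unsecured cluster, that the summary is served at `/nifi-api/controller/cluster`, and that each node entry carries `address`, `apiPort`, `status` and `heartbeat` fields. Both function names are hypothetical, and the field names should be checked against the REST API documentation for the NiFi version in use.

```python
import json
import urllib.request


def fetch_cluster_summary(base_url):
    """Fetch the cluster summary as a dict (assumed endpoint, unsecured cluster)."""
    url = base_url.rstrip("/") + "/nifi-api/controller/cluster"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def unhealthy_nodes(summary):
    """Return (address:port, status, last heartbeat) for nodes that are not CONNECTED."""
    # Assumed payload shape: {"cluster": {"nodes": [{...}, ...]}}
    nodes = summary.get("cluster", {}).get("nodes", [])
    return [
        (f"{n.get('address')}:{n.get('apiPort')}", n.get("status"), n.get("heartbeat"))
        for n in nodes
        if n.get("status") != "CONNECTED"
    ]
```

Running `unhealthy_nodes(fetch_cluster_summary(...))` on a schedule and logging the result with a timestamp would give a time window to compare against container and host metrics for the affected node.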

Does this mean that the moment a flow file gets corrupted, the cluster
becomes unusable?

Regards
Anil


On Tue, May 15, 2018 at 3:47 PM, Anil Rai <anilrainifi@gmail.com> wrote:

> Team,
>
> We have a 3 node nifi docker cluster running on a VM. Frequently we see
> one of the nodes in the cluster stops sending heartbeat. We do not see any
> errors in the logs for that container. But when we check the status of
> Nifi, it shows it is running. Since the heart beat stops, the node is
> disconnected from the cluster.The canvas is not usable at this point till
> we go to the cluster menu and explicitly remove the offending node.
> Then if we restart the nifi instance, it comes up and joins the cluster.
> We are not sure if this is a docker issue or nifi 1.5 issue. On top, we do
> not see any error when the heart beat stops from one of the nodes. Any
> suggestions on how to trouble shoot this further?
>
> Thanks
> Anil
>
>
