nifi-dev mailing list archives

From Pierre Villard <pierre.villard...@gmail.com>
Subject Re: Restarting NiFi causing SiteToSiteBulletinReportingTask to fail
Date Wed, 18 Apr 2018 09:36:59 GMT
I created https://issues.apache.org/jira/browse/NIFI-5092 to track the
issue and will submit a fix very soon.
Current workaround: after a NiFi restart, stop the reporting task, clear
the state of the reporting task and start the reporting task.
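
To illustrate why clearing the state could help, here is a minimal toy
model of the suspected mechanism. It is not NiFi's code: the class name,
the "last_id" key and the idea that bulletin IDs start over after a restart
while the cached ID survives are all assumptions made for the example.

```python
class ToyBulletinTask:
    """Toy model (not NiFi code): forwards only bulletins newer than the cached ID."""

    def __init__(self, state):
        # 'state' stands in for the reporting task's persisted state (hypothetical).
        self.state = state

    def run(self, bulletin_ids):
        # Read the cached ID of the last bulletin sent; -1 means "nothing sent yet".
        last_sent = self.state.get("last_id", -1)
        fresh = [b for b in bulletin_ids if b > last_sent]
        if fresh:
            self.state["last_id"] = max(fresh)
        return fresh


state = {}
task = ToyBulletinTask(state)

# Before the restart: every bulletin is forwarded and the cached ID becomes 5.
before_restart = task.run([1, 2, 3, 4, 5])

# ASSUMPTION: after a restart the IDs start over, but the cached ID is still 5,
# so every new bulletin looks "already sent" and nothing is forwarded.
after_restart = task.run([1, 2, 3])

# Workaround from this message: clear the task's state, then run it again.
state.clear()
after_clear = task.run([1, 2, 3])

print(before_restart, after_restart, after_clear)
```

In this model the task stays silent after the restart until its state is
cleared, which matches the behaviour reported in this thread.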

Pierre

2018-04-18 0:04 GMT+02:00 Pierre Villard <pierre.villard.fr@gmail.com>:

> Hi Chad,
>
> I confirm that I can reproduce the issue on my side with a NiFi 1.5.0
> cluster and I don't see anything that would fix it in NiFi 1.6.0.
>
> I had a closer look and it does not seem related to the Site-to-Site
> mechanism: the thread in charge of refreshing the peers is correctly
> running and you should see logs like "Successfully refreshed Peer Status;
> remote instance consists of X peers".
>
> As far as I can see, it sounds related to how we are caching the ID of the
> last bulletin sent and how we retrieve this value to "restart" the task
> after the NiFi node restarted. That's why you have to delete the task and
> create it again: it'll delete the associated cache.
>
> That's just an assumption after a quick look, I'll keep digging tomorrow
> and open a JIRA for that.
>
> Thanks for reporting it!
>
> Pierre
>
>
> 2018-04-12 23:41 GMT+02:00 Pierre Villard <pierre.villard.fr@gmail.com>:
>
>> Hi Chad,
>>
>> I believe this could have been fixed recently, but I have very limited
>> access right now (and for the next few days) and can't be sure.
>> I will check next week if no one has given you feedback by then.
>>
>> Pierre
>>
>> 2018-04-12 19:57 GMT+02:00 Woodhead, Chad <Chad.Woodhead@ncr.com>:
>>
>>> I am running HDF 3.0.1.1 which comes with NiFi 1.2.0.3.0.1.1-5. We are
>>> using SiteToSiteBulletinReportingTask to monitor bulletins (for things
>>> like Disk Usage and Memory Usage). After we restart NiFi via Ambari (either
>>> with a Restart or a Stop followed by a Start), the
>>> SiteToSiteBulletinReportingTask no longer works once NiFi comes back up.
>>> It throws the following error when it first tries to start:
>>>
>>> SiteToSiteBulletinReportingTask[id=ba6b4499-0162-1000-0000-00003ccd7573]
>>> org.apache.nifi.remote.client.PeerSelector@34e976af Unable to refresh
>>> Remote Group's peers due to response code 409:Conflict with explanation:
>>> null
>>>
>>> No matter how long we wait, it never works. The ways I have been able to
>>> get it to start working again are as follows:
>>>
>>>   *   Stop and then Start the Remote Input Port the
>>> SiteToSiteBulletinReportingTask is using
>>>   *   Delete the SiteToSiteBulletinReportingTask and create a new one
>>>   *   Wait a while and stop and start the SiteToSiteBulletinReportingTask
>>> (however this doesn't work consistently)
>>>
>>> I have tested the same flow steps using a process that uses a Remote
>>> Process Group and a different Remote Input Port, and that RPG throws the
>>> same error when first coming up but then starts working after a period of
>>> time. So maybe the SiteToSiteBulletinReportingTask isn't trying enough
>>> times to connect to the Remote Input Port?
>>>
>>> Sincerely,
>>> Chad Woodhead
>>>
>>
>>
>
