nifi-users mailing list archives

From Jeremy Farbota <jfarb...@payoff.com>
Subject Re: issues with Kafka ConsumeKafka_0_10 (1.0.0)
Date Fri, 23 Jun 2017 20:10:08 GMT
Haha, well put. Switching from 0.6 to 1.0.0 was quite a mountain for my
DevOps guy, who was completely new to NiFi. I expect this next upgrade to be
a lot smoother.

Thanks for the tip, btw. It was definitely a back pressure issue. The weird
thing is the back pressure seemed to be shared within the parent process
group. I have all my Kafka consumers in one parent process group, with each
topic getting its own process group. Each consumer has two connections
(outputs). One is for the immediate ingest and the other dumps raw records
to HDFS after a merge (which requires a time or size threshold to be met).
So that merge connection ends up with a lot of objects queued up.
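
In case it helps anyone else, this is roughly how I eyeballed the backlog.
It's just an untested sketch against the standard NiFi REST API (paths and
JSON field names may differ by version, a secured cluster would also need
auth, and the host and group ID below are placeholders):

import requests

# Placeholder host and process group ID -- replace with your own. This
# assumes an unsecured HTTP NiFi instance (no auth token handling).
NIFI_API = "http://nifi-host:8080/nifi-api"
GROUP_ID = "replace-with-the-topic-process-group-uuid"

# Ask NiFi for a status snapshot of the process group. The response includes
# one snapshot per connection defined directly inside that group.
resp = requests.get(f"{NIFI_API}/flow/process-groups/{GROUP_ID}/status")
resp.raise_for_status()
snapshot = resp.json()["processGroupStatus"]["aggregateSnapshot"]

# Print each connection's name and queued flowfile count, biggest backlog
# first, so the connection feeding the merge shows up at the top.
conns = [c["connectionStatusSnapshot"] for c in snapshot["connectionStatusSnapshots"]]
for conn in sorted(conns, key=lambda c: c["flowFilesQueued"], reverse=True):
    print(f'{conn["name"]}: {conn["flowFilesQueued"]} flowfiles queued')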

There were fewer than 5K objects in each one and the threshold was set to
10K for each; however, backpressure was triggered on all of the consumer
processors. Even some with only 3 objects in the connection had backpressure
(I assume). Anyhow, I'll see how the behavior is in the new version and will
reply if I hit this issue again. I wanted to report back with a solution in
case others have this issue on the older version.
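
If anyone wants to check instead of assuming like I did, the connections
endpoint also exposes the configured object threshold, so a similar sketch
(same caveats and placeholders as above, and untested) can flag which
connections have actually hit backpressure on the object-count side:

import requests

NIFI_API = "http://nifi-host:8080/nifi-api"      # placeholder host
GROUP_ID = "replace-with-the-topic-process-group-uuid"

# List the connections defined directly inside the process group. Each entry
# carries its configuration (including the back pressure object threshold)
# alongside a current status snapshot.
resp = requests.get(f"{NIFI_API}/process-groups/{GROUP_ID}/connections")
resp.raise_for_status()

for conn in resp.json()["connections"]:
    name = conn["component"].get("name") or conn["id"]
    threshold = int(conn["component"]["backPressureObjectThreshold"])
    queued = conn["status"]["aggregateSnapshot"]["flowFilesQueued"]
    # Only the object-count threshold is checked here; backpressure can also
    # engage on the data-size threshold, which this sketch ignores.
    state = "backpressure" if threshold and queued >= threshold else "ok"
    print(f"{name}: {queued}/{threshold} objects queued [{state}]")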

Thanks again. Now I can rest easy over the weekend without any red in the
back of my head. :)

[image: Payoff, Inc.]
*Jeremy Farbota*
Software Engineer, Data
Payoff, Inc.

jfarbota@payoff.com
(217) 898-8110 <+2178988110>

On Fri, Jun 23, 2017 at 12:46 PM, Joe Witt <joe.witt@gmail.com> wrote:

> I get it.  If there are curveballs you run into with upgrades, and ideas
> you have to make it easier, please let us know.  We are moving toward the
> extension registry concept, which will help a ton in that regard, and
> we've done a lot with backward/forward-compatible bits, but upgrades can
> still make some people feel like they're cliff diving into unknown waters
> without the fun adrenaline rush.
>
> On Fri, Jun 23, 2017 at 3:43 PM, Jeremy Farbota <jfarbota@payoff.com>
> wrote:
>
>> I see. That makes some sense.
>>
>> I've been reading about the improvements, and it is good to reaffirm that
>> the new version will likely help us out a lot. We do not plan on getting
>> this far behind again.
>>
>> [image: Payoff, Inc.]
>> *Jeremy Farbota*
>> Software Engineer, Data
>> Payoff, Inc.
>>
>> jfarbota@payoff.com
>> (217) 898-8110 <+2178988110>
>>
>> On Fri, Jun 23, 2017 at 12:40 PM, Joe Witt <joe.witt@gmail.com> wrote:
>>
>>> Jeremy
>>>
>>> It is possible that backpressure was being engaged in NiFi and causing
>>> our consumer code to handle it poorly.  We did fix that a while ago, and I
>>> think it ended up in NiFi 1.2.0 (off the top of my head, anyway).  Between
>>> your current release and the latest 1.3.0 release, a few bugs with those
>>> processors have been fixed (quite useful ones), and we've added processors
>>> that allow you to consume and publish record objects; if you've read about
>>> the record reader/writer stuff at all, I bet you'll find them really
>>> helpful for your flows.
>>>
>>> Thanks
>>>
>>> On Fri, Jun 23, 2017 at 3:31 PM, Jeremy Farbota <jfarbota@payoff.com>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> I'm having issues today with my ConsumeKafka_0_10 processors (Kafka is
>>>> 0.10.1 on 3 nodes; NiFi is a 1.0.0 3-node cluster). They are all throwing
>>>> this error seemingly with each new batch (see attached). We are not seeing
>>>> errors on other client consumers (Clojure, Spark).
>>>>
>>>> My questions are:
>>>>
>>>> 1) Does this error indicate that some offsets might not be getting
>>>> consumed, or does the consumer restart and re-read the offset when the
>>>> problem occurs? Can I safely ignore this error for the time being, since
>>>> messages seem to keep coming through regardless?
>>>>
>>>> 2) I reduced max.poll.records to 10 and I'm still getting this error. I
>>>> also increased the heap and restarted the service on each node. I got this
>>>> error shortly after I clicked to look at a provenance event on a processor.
>>>> I've had an issue in the past where I clicked to look at a provenance event
>>>> and one node went down from a buffer overload. Is it possible that there is
>>>> some connection between this error and some background provenance process
>>>> that I can kill? Could this be a memory issue? Is this a known bug with the
>>>> consumer?
>>>>
>>>> We're upgrading to 1.3.0 next week. Is it possible that the upgrade
>>>> will fix this issue with ConsumeKafka_0_10?
>>>>
>>>>
>>>> [image: Payoff, Inc.]
>>>> *Jeremy Farbota*
>>>> Software Engineer, Data
>>>> Payoff, Inc.
>>>>
>>>> jfarbota@payoff.com
>>>>
>>>
>>>
>>
>
