nifi-dev mailing list archives

From Oscar dela Pena <odelap...@exist.com>
Subject Re: NiFi 0.4.1 Very slow processing of flow files using PutKafka
Date Tue, 05 Apr 2016 00:47:31 GMT
Hi, 

Thanks for the replies. Your recommendation is to use the 0.4.1 Kafka nar, correct? Do we need
to upgrade NiFi from 0.4.0 to 0.4.1, or would upgrading just the nar be sufficient? 
Does this also mean that 0.4.0 isn't a stable version to use as a Kafka producer at high data
rates? 

Thanks, 
Oscar 


----- Original Message -----

From: "Tim Reardon" <tequalsme@gmail.com> 
To: dev@nifi.apache.org 
Cc: dev@nifi.incubator.apache.org 
Sent: Monday, April 4, 2016 8:26:23 PM 
Subject: Re: NiFi 0.4.1 Very slow processing of flow files using PutKafka 

I wouldn't advise upgrading to 0.6.0 to address PutKafka issues, as there 
is an outstanding bug (NIFI-1701) that truncates messages. 
NiFi 0.5.1 with the 0.4.1 Kafka nar is at least functional. 

On Mon, Apr 4, 2016 at 6:48 AM, Oleg Zhurakousky < 
ozhurakousky@hortonworks.com> wrote: 

> Oscar 
> 
> Would you mind upgrading to NiFi 0.6.0? There were significant 
> improvements to Kafka module 
> 
> Thanks 
> Oleg 
> 
> On Apr 4, 2016, at 04:21, Oscar dela Pena <odelapena@exist.com> wrote: 
> 
> Hi NiFi team, 
> 
> This is our current NiFi flow: 
> Our Kafka is version 0.8.2 and NiFi is version 0.4.0. The two versions 
> should be compatible according to the source code and the release notes here: 
> <https://cwiki.apache.org/confluence/display/NIFI/Release+Notes> 
> However, PutKafka is extremely slow when processing queued flow files, 
> which arrive at a rate of 40GB/hour. 
> We had to add a dynamic property *block.on.buffer.full = true* to get rid 
> of "BufferExhaustedException", and set the buffer size to 4GB. 
> 
> Flow files are plain text files and are delimited by \n (newline). 
> The file delimiter is set to \n. 
> Everything else is left at the default processor values. 
> 
> The previous versions of Kafka and NiFi don't have this problem. We have 
> another running Kafka instance to prove it. 
> There we used PutKafka v0.3.0, which we built from source and renamed. 
> It sends the same messages to Kafka 0.8.1 as PutKafka v0.4.0 does, with the 
> same delimiter. All messages are sent without problems. More 
> details on the configurations are posted below. 
> 
> <PutKafka040.jpg> 
> 
> 
> 
> *PutKafka 0.3.0* 
>   *Kafka Version*: 0.8.1 
>   *Errors*: No error. All messages are sent to Kafka. 
> 
> *PutKafka 0.4.0* 
>   *Kafka Version*: 0.8.2 
>   *Errors*: Without *block.on.buffer.full = true*, BufferExhaustedException 
>   occurs. I added block.on.buffer.full = true as a dynamic property and 
>   BufferExhaustedException is gone. 
>   *Sending records is really slow.* 
>   120GB worth of data gets stuck in the queue when running the flow for 2hrs. 
> 
> <Selection_603.png> 
> <Selection_605.png> 
> 
> 
> How should we configure our PutKafka so that it has the same 
> performance as the old PutKafka (0.3.0)? 
> 
> Thanks, 
> Oscar 
> 
> 
> 
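For readers following the numbers in the quoted message, here is a rough sketch of what they imply. It converts the stated 40GB/hour input rate into a sustained per-second rate and writes the two producer settings described in the thread as Kafka producer property names. The names `required_mb_per_sec` and `producer_props` are hypothetical, the mapping of "buffer size" to the `buffer.memory` property is an assumption (the message does not name the property), and so is reading 4GB as 4 * 1024^3 bytes.

```python
# Back-of-the-envelope check of the figures quoted in the thread.
# Nothing here talks to Kafka or NiFi; it is plain arithmetic and a dict.

def required_mb_per_sec(gb_per_hour: float) -> float:
    """Sustained rate in decimal MB/s needed to keep up with gb_per_hour."""
    return gb_per_hour * 1000 / 3600


# Settings mentioned in the thread, written as Kafka producer properties.
# ASSUMPTION: "buffer size" in the message refers to buffer.memory (bytes),
# and "4GB" means 4 * 1024**3 bytes.
producer_props = {
    "block.on.buffer.full": "true",
    "buffer.memory": str(4 * 1024**3),
}

if __name__ == "__main__":
    print(f"40 GB/hour needs about {required_mb_per_sec(40):.1f} MB/s sustained")
    for key, value in producer_props.items():
        print(f"{key} = {value}")
```

If PutKafka sustains less than roughly that rate, the queue grows no matter how large the buffer is; blocking when the buffer is full only trades the exception for back-pressure.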

