nifi-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Craig Knell <craig.kn...@gmail.com>
Subject Re: [EXT] Sliding windows
Date Wed, 05 Jun 2019 01:25:58 GMT
Joe

Thanks for the options.  

Kafka Flink option is interesting but to big at the moment. 

I’ll build a custom processor for this occasion.  

Much appreciated!

Craig Knell


> On 4 Jun 2019, at 21:56, Joe Witt <joe.witt@gmail.com> wrote:
> 
> ...from the description it isn't clear what you're trying to achieve so
> lets first try to expand the detail on the use case.
> 
> We should distinguish whether you're wanting to 'combine various objects in
> a data stream together on some time bound' from 'processing various objects
> in a data stream to make some observation over some time bound'.
> 
> If you're wanting to merge data together to make a larger object comprised
> of those smaller objects then MergeContent is your friend.
> 
> If you're wanting to look at a given stream or set of streams at once and
> make a time window based observation over that data then I recommend
> looking at something like Apache Flink which is purpose built for that and
> should be better than NiFi at that part.  If it is a pretty straight
> forward single stream window evaluation and you want to avoid having
> another system in play then I'd just write a little custom processor in
> NiFi for your case.  Once you have a more complex data distribution and
> processing requirement and you want a powerful low latency combination I'd
> say put NiFi, Kafka, and Flink together for a pretty hard to beat combo.
> 
> Thanks
> 
> On Tue, Jun 4, 2019 at 9:51 AM Peter Wicks (pwicks) <pwicks@micron.com>
> wrote:
> 
>> Craig,
>> 
>> If you have a timestamp set as an attribute on the processor, then this is
>> kind of possible.
>> 
>> Have a regular MergeContent processor, with "Maximum Group Size" set to 1
>> mb, set "Max Bin Age" to 3 min; you may need to tweak settings to get the
>> right cadence, but these are generally the settings you need to touch. Use
>> the "Merged" relationship for whatever you need. To create the Window, pass
>> the "Original" relationship to a RouteOnAttribute processor.
>> 
>> In the RouteOnAttribute use NiFi Expression Language to calculate how old
>> the FlowFile is (using the timestamp attribute I mentioned). If the
>> FlowFile is older than x, drop it, else send it back to the MergeContent
>> processor.
>> 
>> Using this process, it should be easy to get a 5 min rolling window (drop
>> any FlowFile older than 5 min in RouteOnAttribute).
>> 
>> I don't know that this perfectly answers what you asked, but does it give
>> you a good direction to investigate?
>> 
>> Thanks,
>>  Peter
>> 
>> -----Original Message-----
>> From: Craig Knell <craig.knell@gmail.com>
>> Sent: Tuesday, June 4, 2019 1:32 AM
>> To: dev@nifi.apache.org
>> Subject: [EXT] Sliding windows
>> 
>> Hi Folks
>> 
>> We have a stream of data that I need to window to 5 minutes and the window
>> is to slide every 3 minutes. Each minute is 1 mb, I therefore have to
>> deliver 5mb per 3 minutes.
>> 
>> What is the best way of achieving this in nifi?
>> 
>> Best regards
>> 
>> Craig
>> 

Mime
View raw message