nifi-users mailing list archives

From Matt Foley <ma...@apache.org>
Subject Re: How to gracefully handle a circular graph?
Date Mon, 27 Feb 2017 23:21:49 GMT
If I understand correctly, your goal is that for each input row that specifies a range,
A to A+N, you generate a sequence of N (or perhaps N+1) flowfiles, right?  And the
only difference between the flowfiles is that the range specification has been replaced with a
single number from that range?

 

I would suggest that at the level of the row input, you use ExecuteScript to expand each input
row into N rows, with the substituted number values, then run that through SplitText, to get
one row per flowfile.  This should be way more efficient, as well as much safer than a cyclic
graph.
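
To make the expansion step concrete, here is a minimal sketch of the row-expansion logic in plain Python. It assumes a hypothetical row format in which each line contains one range written as "lower-upper" (e.g. 100-200) and treats the range as inclusive; the names expand_row and expand_rows are illustrative only. The ExecuteScript wiring (reading the flowfile content and writing the result back) is deliberately left out, so this shows only the text transformation, not a complete script.

```python
import re

# Assumption: each row contains a range written as "<lower>-<upper>",
# for example "100-200". The real row layout is not given in the thread.
RANGE_PATTERN = re.compile(r"\b(\d+)-(\d+)\b")


def expand_row(row):
    """Return one copy of `row` per number in its first range (inclusive)."""
    match = RANGE_PATTERN.search(row)
    if match is None:
        return [row]  # no range found; pass the row through unchanged
    lower, upper = int(match.group(1)), int(match.group(2))
    prefix, suffix = row[:match.start()], row[match.end():]
    # Substitute each number in the range for the range specification.
    return [prefix + str(n) + suffix for n in range(lower, upper + 1)]


def expand_rows(text):
    """Expand every non-empty line of `text`, returning one row per line."""
    expanded = []
    for line in text.splitlines():
        if line.strip():
            expanded.extend(expand_row(line))
    return "\n".join(expanded)
```

The output is one row per line, which is the shape SplitText expects when it is configured to emit one line per flowfile.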

 

Cheers,

--Matt

 

From: Scott Wagner <swagner@beenverified.com>
Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
Date: Monday, February 27, 2017 at 2:34 PM
To: "users@nifi.apache.org" <users@nifi.apache.org>
Subject: How to gracefully handle a circular graph?

 

Hello all,

    I have created a graph where I am downloading a number of rows from an SQL database, and
each row defines a range of numbers (100-200, 700-1500, etc.).  What I am then doing on the
NiFi side is generating an individual FlowFile for each number in that range.  The way that
I was accomplishing this was by setting a "current" attribute to the lower boundary and
another attribute to the upper boundary, and then creating two queues off the "success" output
of a Processor (the ReplaceText processor in the bottom right of the image).  One queue
goes on to process that number's record (going off the bottom right in the picture); the
other goes to a processor that increments the "current" number and then forwards the
FlowFile to the processor that checks that "current" is less than or equal
to "upper boundary".

    This works great, until the queues end up filling up.  Once this happens, I have a gridlock
situation where none of the processors in this triangle are running any longer, because they
all have a full output queue.  I have tried searching the Internet and put a little thought
into how I could make it so that my "Check if done" processor would prefer entries that are
coming in from the circular portion of the graph, but so far haven't been able to come up
with anything.  What I have tried is routing both of the input queues to "Check if done"
through a funnel and setting an Oldest FlowFile prioritizer, but the entire triangle of
queues still eventually fills up.



    Does anyone have a suggestion as to how I could gracefully handle a situation like this?
 I appreciate any advice.

Thanks!

- Scott Wagner


 

