nifi-users mailing list archives

From Scott Wagner <swag...@beenverified.com>
Subject Re: How to gracefully handle a circular graph?
Date Tue, 28 Feb 2017 15:32:32 GMT
Matt,

     Thanks for the suggestion.  I did end up creating an ExecuteScript 
processor that expands each range of numbers into the content, 
and it is working for this scenario.
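
A minimal sketch of that kind of expansion, in plain Python. The row 
format ("lower-upper", one range per line, both bounds inclusive) and the 
names expand_range and expand_content are assumptions made for 
illustration; the actual script and the ExecuteScript session wiring are 
not shown in this thread.

```python
def expand_range(row):
    """Expand one 'lower-upper' row into a list of single-number rows.

    Assumption: both bounds are non-negative integers and the range is
    inclusive, matching the "current <= upper boundary" check in the
    original flow.
    """
    lower_text, upper_text = row.strip().split("-", 1)
    lower, upper = int(lower_text), int(upper_text)
    if lower > upper:
        raise ValueError("lower bound exceeds upper bound: %r" % row)
    return [str(n) for n in range(lower, upper + 1)]


def expand_content(content):
    """Expand every non-blank row of the content into one number per line."""
    out = []
    for row in content.splitlines():
        if row.strip():
            out.extend(expand_range(row))
    if not out:
        return ""
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Two ranges in, one number per output line.
    print(expand_content("100-103\n700-702\n"), end="")
```

The expanded content can then go through SplitText with a line split 
count of 1 to get one number per flowfile, so no loop in the graph is 
needed.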

Thanks again!

- Scott

> Matt Foley <mailto:mattf@apache.org>
> Monday, February 27, 2017 5:21 PM
>
> If I understand correctly, your desired goal is for each input row 
> that specifies a range, A to A+N, you would generate a sequence of N 
> (or perhaps N+1) flowfiles, right?  And the only difference in each 
> flowfile is that you’ve replaced the range specification with a single 
> number from that range?
>
> I would suggest that at the level of the row input, you use 
> ExecuteScript to expand each input row into N rows, with the 
> substituted number values, then run that through SplitText, to get one 
> row per flowfile.  This should be way more efficient, as well as much 
> safer than a cyclic graph.
>
> Cheers,
>
> --Matt
>
> *From: *Scott Wagner <swagner@beenverified.com>
> *Reply-To: *"users@nifi.apache.org" <users@nifi.apache.org>
> *Date: *Monday, February 27, 2017 at 2:34 PM
> *To: *"users@nifi.apache.org" <users@nifi.apache.org>
> *Subject: *How to gracefully handle a circular graph?
>
> Hello all,
>
>     I have created a graph where I am downloading a number of rows 
> from an SQL database, and each row defines a range of numbers 
> (100-200, 700-1500, etc.).  What I am then doing on the NiFi side is 
> generating an individual FlowFile for each number in that range.  The 
> way that I was accomplishing this was by setting a "current" attribute 
> to the lower boundary and another attribute to the upper boundary, 
> and then creating two queues off the "success" output for a 
> Processor (the ReplaceText processor in the bottom right of the 
> image), one of which goes on to process that number's record (going 
> off the bottom right in the picture), and the other one of which goes 
> off to a processor to increment the "current" number, and will then 
> forward it to the processor that will check to make sure that 
> "current" is less than or equal to "upper boundary".
>
>     This works great, until the queues end up filling up.  Once this 
> happens, I have a gridlock situation where none of the processors in 
> this triangle are running any longer, because they all have a full 
> output queue.  I have tried searching the Internet and put a little 
> thought into how I could make it so that my "Check if done" processor 
> would prefer entries that are coming in from the circular portion of 
> the graph, but so far haven't been able to come up with anything.  
> What I have tried is making both of the input queues to "Check if 
> done" go through a funnel, and set an Oldest FlowFile prioritizer, but 
> it still eventually ends up filling up the entire triangle of queues.
>
>
>
>     Does anyone have a suggestion as to how I could gracefully handle 
> a situation like this?  I appreciate any advice.
>
> Thanks!
>
> - Scott Wagner


