nifi-dev mailing list archives

From "Irizarry Jr., Nazario" <...@mitre.org>
Subject Re: [DISCUSS] Run Once scheduling
Date Thu, 12 Jan 2017 19:04:34 GMT
The users that I work with are in the data-analytic space.  Without NiFi what they tend to
do is to build scripts, often shell scripts that they edit and modify from run to run.  Thus
for this class of user it is not really about continuous flows.  But, the connect-the-box
metaphor is nonetheless a very good way to automate what they do and get them away from lots
of script editing.

Thus, for that type of application and those types of users, when one builds a script each step
has either been executed or is going to be executed.  (For discussion's sake, let's ignore
conditional execution.)  That is why the ability to create a flow in which items have
or have not run is a natural way to migrate from scripts to flows.

Naz Irizarry
MITRE Corp.
617-893-0074



> On Jan 12, 2017, at 1:16 PM, Oleg Zhurakousky <ozhurakousky@hortonworks.com> wrote:
> 
> I was just about to suggest the same. 
> Run-once would be a bit counterintuitive to flow processing as a concept. Basically,
> think of it this way: a flow, or parts of it, has only two states - RUNNING or STOPPED. In the
> RUNNING state it processes the data as it arrives (every second, every minute, every day,
> etc.). Indeed, there may be a concern that the processor will do a lot of ‘dry’ spins if no
> data is available, but fortunately NiFi allows you to limit the impact of that by configuring
> the “yield duration”. By default it is set to 1 sec, but for your case you may want to set
> it to 1 hour or so, essentially controlling the scheduling of such a processor between ‘dry’
> spins.
> 
> That said, and just to entertain the idea of Run Once, what do you think should be the
> processor state if it did run once? Let’s assume it did and somehow was stopped... then
> what? The data arrived on the incoming queue, but nothing is processed until someone manually
> goes and restarts the processor. Right?
> I mean, from the general workflow standpoint the concern is very valid, but from the flow
> processing standpoint the fact that NiFi does not support it is actually more of a feature than
> a lack of functionality.
> 
> Thoughts?
> 
> Cheers
> Oleg
> 
>> On Jan 12, 2017, at 1:02 PM, Joe Witt <joe.witt@gmail.com> wrote:
>> 
>> Naz,
>> 
>> Why not just leave all the processes running?  If the data only
>> arrives periodically that is ok, right?
>> 
>> Thanks
>> Joe
>> 
>> On Thu, Jan 12, 2017 at 10:54 AM, Irizarry Jr., Nazario <naz@mitre.org> wrote:
>>> On a project that I am on we have been looking at using NiFi for orchestrations
>>> that are invoked infrequently.  For example, once a month a new data input product becomes
>>> available and then one wants to run it through a set of processing steps that can be nicely
>>> implemented using NiFi processors.  However, using the interval or cron scheduling for this
>>> purpose begins to get cumbersome after a while with the need to start and manually stop these
>>> occasional flows.
>>> 
>>> It would be fairly easy to add an additional scheduling option - “Run Once” -
>>> for this use case.  The behavior would be that when a processor is set to run once it automatically
>>> stops after it has successfully processed one input.
>>> 
>>> What do people think?  We are willing to implement this small enhancement.
>>> 
>>> Cheers,
>>> 
>>> Naz Irizarry
>>> MITRE Corp.
>>> 617-893-0074
>>> 
>>> 
>>> 
>> 
> 
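
For readers following the thread, the proposed “Run Once” behavior and Oleg's question about what happens afterwards can be sketched as a toy simulation. This is only an illustration under stated assumptions: `ToyProcessor`, `State`, `trigger`, and `demo` are hypothetical names invented here, not NiFi classes or APIs, and the model handles at most one input per scheduling tick.

```python
from collections import deque
from enum import Enum


class State(Enum):
    RUNNING = "RUNNING"
    STOPPED = "STOPPED"


class ToyProcessor:
    """Hypothetical stand-in for a processor; not a NiFi class."""

    def __init__(self, work, run_once=False):
        self.work = work          # function applied to each input
        self.run_once = run_once  # the proposed scheduling option
        self.state = State.STOPPED
        self.queue = deque()      # incoming queue
        self.results = []

    def start(self):
        self.state = State.RUNNING

    def trigger(self):
        """One scheduling tick: process at most one queued input."""
        if self.state is not State.RUNNING or not self.queue:
            return False  # stopped, or a 'dry' spin with no data
        item = self.queue.popleft()
        try:
            self.results.append(self.work(item))
        except Exception:
            self.queue.appendleft(item)  # keep the input for a retry
            return False
        if self.run_once:
            self.state = State.STOPPED  # auto-stop after one success
        return True


def demo():
    proc = ToyProcessor(str.upper, run_once=True)
    proc.queue.extend(["january", "february"])
    proc.start()
    first = proc.trigger()   # processes "january", then stops itself
    second = proc.trigger()  # does nothing: the processor is STOPPED
    return first, second, proc.state, list(proc.queue), proc.results
```

In this sketch the second input stays on the queue until someone calls `start()` again, which is the situation Oleg describes; with `run_once=False` the same processor would simply keep draining the queue on each tick.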
