nifi-dev mailing list archives

From Joe Witt <joe.w...@gmail.com>
Subject Re: ListSFTP incoming relationship
Date Tue, 27 Mar 2018 04:08:34 GMT
Scott

This idea has come up a couple of times and there is definitely
something intriguing to it.  Where I think this idea stalls out though
is in implementation.

While I agree that the other List* processors might similarly benefit,
let's focus on ListFile.  Today you tell ListFile what directory to
start looking for files in.  It goes off scanning that directory for
hits and stores state about what it has already searched/seen.  And it
is important to keep track of how much it has already scanned because
at times the search directory can be massive (hundreds of thousands or
more files and directories to scan, for example).
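
To make the state tracking concrete, here is a rough sketch of the
general idea.  It is not the actual ListFile code; the class name
ListingState and the single-timestamp watermark are assumptions made
purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not NiFi's ListFile code: remember the newest
// modification time already listed so repeat scans emit only newer files.
class ListingState {
    // Newest modification time (epoch millis) listed so far; -1 means none.
    private long latestListedMillis = -1L;

    // entries maps a file path to its last-modified time in epoch millis.
    // Returns the paths not yet listed, in sorted path order.
    List<String> listNew(Map<String, Long> entries) {
        List<String> fresh = new ArrayList<>();
        long newest = latestListedMillis;
        for (Map.Entry<String, Long> e : new TreeMap<>(entries).entrySet()) {
            if (e.getValue() > latestListedMillis) {
                fresh.add(e.getKey());
                newest = Math.max(newest, e.getValue());
            }
        }
        // Advance the watermark only after the whole scan has completed.
        latestListedMillis = newest;
        return fresh;
    }
}
```

With state like this a second pass over an unchanged directory lists
nothing, which is what makes repeated scans of a very large directory
tolerable.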

In the proposed model the directory to be scanned could be provided
dynamically by looking at an attribute of an incoming flowfile (or
other criteria could be provided - not just the directory to scan).  In
this case the ListFile processor starts scanning the newly provided
directory instead.  What about the previous directory (or directories)
it was told to scan?  Does it still track those too?  What if it starts
scanning the newly provided directory, hasn't finished pulling all the
data or new data is continually arriving, and it is told to switch to
another directory?
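
One possible answer, sketched only to frame the discussion, is to key
the stored state by directory so that switching directories does not
throw away earlier progress.  The class and method names below are
hypothetical, and this is not how any existing List* processor is
implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: keep a separate watermark per directory, so a
// directory supplied by a later flowfile does not reset earlier ones.
class PerDirectoryListingState {
    // directory path -> newest modification time (epoch millis) listed
    private final Map<String, Long> watermarks = new HashMap<>();

    // entries maps a file path to its last-modified time in epoch millis.
    List<String> listNew(String directory, Map<String, Long> entries) {
        long seen = watermarks.getOrDefault(directory, -1L);
        long newest = seen;
        List<String> fresh = new ArrayList<>();
        for (Map.Entry<String, Long> e : new TreeMap<>(entries).entrySet()) {
            if (e.getValue() > seen) {
                fresh.add(e.getKey());
                newest = Math.max(newest, e.getValue());
            }
        }
        watermarks.put(directory, newest);
        return fresh;
    }

    // How many directories currently have stored state.
    int trackedDirectories() {
        return watermarks.size();
    }
}
```

Even this simple version shows the cost: the state grows with every
directory ever requested, so some expiry rule would be needed, and it
says nothing yet about a scan that is interrupted partway through.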

I think if those questions can get solid answers and someone invests
time in creating a PR then this could be pretty powerful.  Would be
good to see a written description of the use case(s) for this too.

Thanks
Joe

On Mon, Mar 26, 2018 at 11:58 PM, scott <tcots8888@gmail.com> wrote:
> Hello Devs,
>
> I would like to request a feature to a major processor, ListSFTP. But before
> I go down the official road, I wanted to ask if anyone thought it was a
> terrible idea or impossible, etc. The request is to add support for an
> incoming relationship to the ListSFTP processor specifically, but I could
> see it added to many of the commonly used head processes, such as ListFile.
> I would envision functionality more like InvokeHTTP or ExecuteSQL, where an
> incoming flow file could initiate the action, and the attributes in the
> incoming flow file could be used to configure the processor actions. It's
> the configuration aspect that most appeals to me, because it opens it up to
> being centrally or dynamically configured.
>
> Thanks,
>
> Scott
>
