nifi-commits mailing list archives

From "Joe Skora (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (NIFI-631) Create ListFile and FetchFile processors
Date Thu, 17 Sep 2015 04:34:45 GMT

    [ https://issues.apache.org/jira/browse/NIFI-631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791570#comment-14791570 ]

Joe Skora commented on NIFI-631:
--------------------------------

[~markap14] - yes, I'm still working on it.  The caching in the HDFS versions depends on Hadoop
libraries for filesystem info that I'm working to replicate.

> Create ListFile and FetchFile processors
> ----------------------------------------
>
>                 Key: NIFI-631
>                 URL: https://issues.apache.org/jira/browse/NIFI-631
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Mark Payne
>
> This pair of Processors will provide several benefits over the existing GetFile processor:
> 1. Currently, GetFile will continually pull the same files if the "Keep Source File"
> property is set to true. There is no way to pull a file and leave it in the directory without
> pulling it again on every run. We could implement state here, but it would either be a
> huge amount of state to remember everything pulled, or it would have to always pull the oldest
> file first so that we can maintain just the Last Modified Date of the last file pulled plus
> all files with the same Last Modified Date that have already been pulled.
> 2. If pulling from network-attached storage such as NFS, this would allow a single
> processor to run ListFile and then distribute those FlowFiles to the cluster so that the
> cluster can share the work of pulling the data.
> 3. There are use cases where we may want to pull a specific file (for example, in conjunction
> with ProcessHttpRequest/ProcessHttpResponse) rather than just pull all files in a directory.
> GetFile does not support this.
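
The compact state described in item 1 of the issue (the Last Modified Date of the last file
pulled, plus the files already pulled at that same date) can be sketched roughly as below.
This is only an illustration under stated assumptions, not the code being written for
NIFI-631: the class name ListingState, the method selectNew, and the use of a plain
file-name-to-timestamp map in place of a real directory scan are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical helper (not a NiFi class): remembers only the newest timestamp listed so far. */
final class ListingState {
    // Newest Last Modified Date (epoch millis) of any file already listed; -1 means nothing yet.
    private long latestTimestamp = -1L;
    // Names of the files already listed that carry exactly latestTimestamp.
    private final Set<String> seenAtLatest = new HashSet<>();

    /**
     * Assumed input: file name mapped to last-modified time in millis, e.g. from a directory scan.
     * Returns the names not yet listed, oldest first, and advances the stored state.
     */
    List<String> selectNew(Map<String, Long> candidates) {
        // Group names by timestamp so they can be walked oldest first.
        TreeMap<Long, List<String>> byTime = new TreeMap<>();
        for (Map.Entry<String, Long> e : candidates.entrySet()) {
            byTime.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }

        List<String> fresh = new ArrayList<>();
        // Anything strictly older than latestTimestamp is treated as already listed.
        for (Map.Entry<Long, List<String>> e : byTime.tailMap(latestTimestamp, true).entrySet()) {
            long ts = e.getKey();
            List<String> names = e.getValue();
            Collections.sort(names); // deterministic order within one timestamp
            for (String name : names) {
                if (ts == latestTimestamp && seenAtLatest.contains(name)) {
                    continue; // same timestamp and already listed
                }
                if (ts > latestTimestamp) {
                    // Moving to a newer timestamp: the old "seen" set is no longer needed.
                    latestTimestamp = ts;
                    seenAtLatest.clear();
                }
                seenAtLatest.add(name);
                fresh.add(name);
            }
        }
        return fresh;
    }
}
```

One consequence of keeping so little state: a file that shows up later with a Last Modified
Date older than the stored one would be skipped by this sketch, which is the trade-off against
remembering every file pulled.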



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
