nifi-commits mailing list archives

From "Joseph Witt (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (NIFI-512) Allow GetFile to pull in data without deleting the local file
Date Wed, 03 Jun 2015 15:13:37 GMT

     [ https://issues.apache.org/jira/browse/NIFI-512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Witt resolved NIFI-512.
------------------------------
    Resolution: Duplicate

> Allow GetFile to pull in data without deleting the local file
> -------------------------------------------------------------
>
>                 Key: NIFI-512
>                 URL: https://issues.apache.org/jira/browse/NIFI-512
>             Project: Apache NiFi
>          Issue Type: Task
>          Components: Extensions
>            Reporter: Mark Payne
>
> There have been several people asking for this capability. Currently, when we do a file
> listing, it's placed into a HashSet, so there is no ordering for how we pull the files in.
> My proposal is that we instead order the files such that we pull the oldest file first and
> keep track of the latest timestamp that we've pulled in. This way on restart we can resume
> where we left off.
> I would create a FileOutputStream and keep it open. Write out the timestamp each time
> we pull data in. Then periodically flush the data to disk. Perhaps every second or so - maybe
> this should be configurable. We need a tradeoff between how much possible duplication we get
> and how much time we spend persisting the timestamp.
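
The proposal quoted above could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from NiFi or from the eventual fix: the class name TimestampTracker, its method names, and the append-only state file format (one 8-byte timestamp per pull) are all hypothetical. It orders a listing oldest first, skips files at or before the last recorded timestamp, keeps a FileOutputStream open, and forces the data to disk only once per configurable interval.

```java
import java.io.Closeable;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper (not part of NiFi): remembers the newest timestamp pulled so far.
class TimestampTracker implements Closeable {
    private final FileOutputStream fileOut;   // kept open for the tracker's lifetime
    private final DataOutputStream out;
    private final long syncIntervalMillis;    // the "every second or so" knob
    private long latestTimestamp;
    private long lastSyncMillis;

    TimestampTracker(File stateFile, long syncIntervalMillis) throws IOException {
        // Restore the previous position before reopening the file for appending.
        this.latestTimestamp = readLastTimestamp(stateFile);
        this.fileOut = new FileOutputStream(stateFile, true);
        this.out = new DataOutputStream(fileOut);
        this.syncIntervalMillis = syncIntervalMillis;
        this.lastSyncMillis = System.currentTimeMillis();
    }

    // Reads the last complete 8-byte record; a torn trailing record is ignored.
    static long readLastTimestamp(File stateFile) throws IOException {
        if (!stateFile.exists()) {
            return Long.MIN_VALUE;            // nothing pulled yet
        }
        try (RandomAccessFile raf = new RandomAccessFile(stateFile, "r")) {
            long completeRecords = raf.length() / Long.BYTES;
            if (completeRecords == 0) {
                return Long.MIN_VALUE;
            }
            raf.seek((completeRecords - 1) * Long.BYTES);
            return raf.readLong();
        }
    }

    // Returns files newer than the recorded timestamp, oldest first.
    // Assumption: a strict '>' comparison, so a file sharing the exact
    // timestamp of the last pulled file would be skipped after a restart.
    List<File> selectNewFiles(List<File> listing) {
        List<File> pending = new ArrayList<>();
        for (File f : listing) {
            if (f.lastModified() > latestTimestamp) {
                pending.add(f);
            }
        }
        pending.sort(Comparator.comparingLong(File::lastModified));
        return pending;
    }

    // Call after a file has been pulled in; the file itself is left in place.
    void recordPulled(File f) throws IOException {
        latestTimestamp = Math.max(latestTimestamp, f.lastModified());
        out.writeLong(latestTimestamp);       // handed to the OS on every pull
        long now = System.currentTimeMillis();
        if (now - lastSyncMillis >= syncIntervalMillis) {
            fileOut.getFD().sync();           // forced to disk only periodically
            lastSyncMillis = now;
        }
    }

    long latestTimestamp() {
        return latestTimestamp;
    }

    @Override
    public void close() throws IOException {
        fileOut.getFD().sync();
        out.close();
    }
}
```

A longer sync interval means fewer forced writes but more files that may be pulled a second time after a crash, which is the tradeoff the description mentions. The state file in this sketch grows without bound; a real implementation would presumably truncate or rotate it.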



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
