nifi-users mailing list archives

From Joe Witt <joe.w...@gmail.com>
Subject Re: Generate flowfiles from flowfile content
Date Wed, 23 Sep 2015 23:00:01 GMT
David,

I think, if I read your case correctly, this should be supported really
well.  The flow would be something like:

GetSQS -> SplitJson -> EvaluateJsonPath -> FetchS3Object

In SplitJson you'll break apart the original object into smaller valid
JSON objects.

In EvaluateJsonPath you'll promote the filename/URL you need from the
JSON content to flow file attributes.

In FetchS3Object you'll go grab the item based on the name/URL you
pulled out in EvaluateJsonPath.
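
To make the data movement concrete, here is a rough Python sketch of what
the SplitJson and EvaluateJsonPath steps do to the content. It is not NiFi
code. The sample event, the "Records" array, the s3.bucket.name and
s3.object.key paths, and the attribute names are all assumptions based on
the usual S3 event notification layout, so adjust them to whatever your SQS
messages actually contain.

```python
import json

# Hypothetical SQS message body, assumed to look like an S3 event notification.
SAMPLE_EVENT = json.dumps({
    "Records": [
        {"s3": {"bucket": {"name": "example-bucket"},
                "object": {"key": "incoming/a.csv"}}},
        {"s3": {"bucket": {"name": "example-bucket"},
                "object": {"key": "incoming/b.csv"}}},
    ]
})


def split_json(content, array_field="Records"):
    """Roughly what SplitJson does: emit one JSON document per array element."""
    return [json.dumps(item) for item in json.loads(content)[array_field]]


def evaluate_json_path(content):
    """Roughly what EvaluateJsonPath does when it writes to attributes:
    copy selected values out of the content into a dict of attributes."""
    record = json.loads(content)
    return {
        "s3.bucket": record["s3"]["bucket"]["name"],  # illustrative attribute name
        "filename": record["s3"]["object"]["key"],
    }


def plan_fetches(event_body):
    """Attributes each split flow file would carry into the fetch step."""
    return [evaluate_json_path(part) for part in split_json(event_body)]


if __name__ == "__main__":
    for attrs in plan_fetches(SAMPLE_EVENT):
        print(attrs)
```

In the real flow, the equivalent would be JsonPath expressions along the
lines of $.Records[*] in SplitJson and $.s3.object.key in EvaluateJsonPath,
with FetchS3Object then reading the resulting attribute (for example
${filename}) as the object key. Treat those expressions as a starting point
to verify against your own events.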

Bryan: Any chance you could put together a quick template for David to
check out?

Thanks
Joe

On Wed, Sep 23, 2015 at 3:41 PM, David Klim <davidklmlg@hotmail.com> wrote:
> Hello Bryan,
>
> I should have been more specific. What I am trying to do is to fetch files
> from S3. I am using the GetSQS processor to get new object (files) events,
> and each event is a json containing the list of new objects (files) in the
> bucket. The output of the GetSQS is processed by SplitJson and I get
> flowfiles containing one object key (filename) each. I need to feed this
> into FetchS3Object to retrieve the actual file, but FetchS3Object expects the
> flowfile filename attribute (or any other) to be the filename. So I guess
> the problem is moving the filename string from the flowfile content to some
> attribute.
>
> If there is no other alternative, I will implement this processor.
>
> Thanks!
>
> ________________________________
> From: rbraddy@softnas.com
> To: users@nifi.apache.org
> Subject: RE: Generate flowfiles from flowfile content
> Date: Wed, 23 Sep 2015 19:59:21 +0000
>
>
> Good idea, Adam.
>
>
>
> I will post a separate review thread on the dev@ list to track comments.
>
>
>
> Here’s the repository link:  https://github.com/rickbraddy/nifishare
>
>
>
>
>
> Thanks
>
> Rick
>
>
>
> From: Adam Taft [mailto:adam@adamtaft.com]
> Sent: Wednesday, September 23, 2015 1:48 PM
> To: users@nifi.apache.org
> Subject: Re: Generate flowfiles from flowfile content
>
>
>
> Not speaking for the entire community, but I am sure that such a
> contribution would (at minimum) be appreciated for review, consideration and
> potential inclusion.  The best thing would be ideally hosting the source
> code somewhere that the rest of the community could go to for review.  Maybe
> you could host the GetFileData and PutFileData processors on a GitHub
> repository somewhere?
>
> I think the idea you proposed is good, but might need to be aligned with the
> work (if any) for the referenced ListFile and FetchFile implementation.  And
> the differences in your PutFileData vs. PutFile would ideally be well vetted
> as well.
>
> Adam
>
>
>
>
>
>
>
> On Wed, Sep 23, 2015 at 2:23 PM, Rick Braddy <rbraddy@softnas.com> wrote:
>
> We have already developed a modified GetFile called GetFileData
> that takes an incoming FlowFile containing the path to the file/directory
> that needs to be transferred.  There is a corresponding PutFileData on the
> other side that accepts the incoming file/directory that creates the
> directory/tree as needed or writes the file, then sets the permissions and
> ownership.  GetFileData also receives a file.rootdir attribute that gets
> passed along to PutFileData, so it can rebase the original file’s location
> relative to the configured target directory.  Unlike GetFile/PutFile, these
> processors work with entire directory trees and are triggered by incoming
> FlowFiles to GetFileData.
>
>
>
> Eventually, we want to further enhance these two processors so they can
> break large files into “chunks” and send as multi-part files that get
> reassembled by PutFileData, resolving the limitations associated with huge
> files and content repository size; e.g., there are default 100MB chunk
> threshold and 10MB chunk size properties that will control the chunking, if
> enabled.
>
>
>
> If the community is interested in or would benefit from these processors, we’re
> happy to consider further generalizing and contributing these processors,
> along with any further refinements based upon community review and feedback.
>
>
>
> I believe these processors would address both the Jira and David’s original
> inquiry.
>
>
>
> Rick
>
>
>
> From: Adam Taft [mailto:adam@adamtaft.com]
> Sent: Wednesday, September 23, 2015 1:09 PM
> To: users@nifi.apache.org
> Subject: Re: Generate flowfiles from flowfile content
>
>
>
> Right.  This would be the use case that FetchFile [1] would help solve.
>
> [1] https://issues.apache.org/jira/browse/NIFI-631
>
>
>
> On Wed, Sep 23, 2015 at 1:11 PM, Bryan Bende <bbende@gmail.com> wrote:
>
> Hi David,
>
>
>
> When you say "files I need to retrieve", are you referring to files on the
> local filesystem where NiFi is running?
>
>
>
> If so, I am not aware of an existing processor that does that. Currently we
> have GetFile which polls a directory, but that is not what you want here.
>
>
>
> It would be fairly straightforward to implement with a custom processor
> though... You would read the incoming FlowFile content to get the filename,
> then create a new FlowFile with your desired name, and write the content of
> the local file to the new FlowFile.
>
>
>
> -Bryan
>
>
>
>
>
> On Wed, Sep 23, 2015 at 11:16 AM, David Klim <davidklmlg@hotmail.com> wrote:
>
> Hello,
>
>
>
> In a flow I am defining, I receive a flowfile containing a JSON string. Using
> the SplitJson processor I can extract some JSON paths pointing to some files
> I need to retrieve, but the filename is the content of the generated
> flowfile. So I would need to be able to read the content and generate a
> flowfile with that name instead. How could I do that?
>
>
>
> Thanks!
>
>
>
>
>
>
>
>
