nifi-commits mailing list archives

From "Tony Kurc (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (NIFI-1081) Add option to ExecuteStreamCommand to put value of execution to an attribute
Date Sat, 14 Nov 2015 14:17:10 GMT

    [ https://issues.apache.org/jira/browse/NIFI-1081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15005412#comment-15005412
] 

Tony Kurc commented on NIFI-1081:
---------------------------------

https://issues.apache.org/jira/browse/NIFI-1081?focusedCommentId=15001211&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15001211

This is an example of "I wonder what [~markap14] was looking at to make him say this, because it
looks to me like maybe it wasn't fixed."

I have also seen tickets with a diarrhea of patches that I had to wade through to
figure out which one to apply, which is also a waste of time, as Mark mentioned.

Are deleted patches memorialized somewhere that I don't know about?

I believe the approach Apache Yetus prescribes solves both of these problems and also makes
some pre-analysis possible (imagine a magical world where a patch is automatically checked
against our checkstyle rules, confirmed to build cleanly, and run through RAT before it is even
reviewed; see NIFI-577). Also, should we have moved this discussion to the dev list?

> Add option to ExecuteStreamCommand to put value of execution to an attribute
> ----------------------------------------------------------------------------
>
>                 Key: NIFI-1081
>                 URL: https://issues.apache.org/jira/browse/NIFI-1081
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Joseph Percivall
>            Assignee: Joseph Percivall
>            Priority: Minor
>             Fix For: 0.4.0
>
>         Attachments: ExecuteStreamCommandTester.xml, NIFI-1081_fix_property_name.patch
>
>
> This issue arose from a user on the mailing list. It demonstrates the need to be able
to put the output of ExecuteStreamCommand to an attribute:
> I'm looking to process many files into common formats.  The source files are coming in
various character sets, mime types, and new line terminators.
> My thinking for a data flow was along these lines:
> GetFile (from many sub directories) -> 
> ExecuteStreamCommand (file -i) ->
> ConvertCharacterSet (from previous command to utf8) ->
> ReplaceText (to change any \r\n into \n) ->
> PutFile (into a directory structure based on values found in the original file path and
filename)
> Additional steps would be added for archiving a copy of the original, converting xml
files, etc.
> Attempting to process these with NiFi leaves me confused as to how to process within
the tool.  If I want to ConvertCharacterSet, I have to know the input type.  I set up an ExecuteStreamCommand
to run file -i ${absolute.path:append(${filename})}, which returned the expected values.  I don't
see a way to turn these results into input for the processor, which doesn't accept expression
language for that field.
> I also considered ConvertCSVToAvro as an interim step but notice the same issue.  Any
suggestions what this dataflow should look like?
> Bryan Bende's response:
> One problem with the above flow is that ExecuteStreamCommand will replace the contents
of the FlowFile with the results of the command, so the FlowFile will have the encoding value
and no longer have the original content.
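
To make the request concrete, here is a minimal Python sketch of the behavior the issue asks for: run `file -i`, keep the file content untouched, and record the result as attributes instead. It is not NiFi code. The function names and the attribute names `detected.mime.type` and `detected.charset` are hypothetical, and the parsing assumes the usual GNU `file -i` output shape (`path: type/subtype; charset=name`).

```python
import re
import subprocess
from typing import Dict, Optional, Tuple

# Matches the tail of `file -i` output, e.g.
# "data.csv: text/plain; charset=iso-8859-1" (assumed GNU file format).
_MIME_RE = re.compile(
    r"(?P<mime>[\w.+-]+/[\w.+-]+)\s*;\s*charset=(?P<charset>[\w.-]+)\s*$"
)


def parse_file_mime(output: str) -> Optional[Tuple[str, str]]:
    """Extract (mime type, charset) from one line of `file -i` output."""
    match = _MIME_RE.search(output.strip())
    if match is None:
        return None
    return match.group("mime"), match.group("charset")


def detect_to_attributes(path: str, attributes: Dict[str, str]) -> Dict[str, str]:
    """Run `file -i` on path and return a copy of attributes with the result added.

    The file content is never modified; only the attribute map grows.
    The attribute names are hypothetical, not taken from NiFi.
    """
    result = subprocess.run(
        ["file", "-i", path], capture_output=True, text=True, check=True
    )
    updated = dict(attributes)
    parsed = parse_file_mime(result.stdout)
    if parsed is not None:
        updated["detected.mime.type"], updated["detected.charset"] = parsed
    return updated
```

With the charset held in an attribute, a later conversion step could read it from there while the original content stays in place, which is the gap the user describes above.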



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
