nifi-commits mailing list archives

From "Michal Klempa (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (NIFI-1562) ExecuteStreamCommand and ExecuteProcess do not support empty command line arguments
Date Thu, 25 Feb 2016 08:34:18 GMT

    [ https://issues.apache.org/jira/browse/NIFI-1562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15166918#comment-15166918 ]

Michal Klempa commented on NIFI-1562:
-------------------------------------

Pull request here: https://github.com/apache/nifi/pull/247

> ExecuteStreamCommand and ExecuteProcess do not support empty command line arguments
> -----------------------------------------------------------------------------------
>
>                 Key: NIFI-1562
>                 URL: https://issues.apache.org/jira/browse/NIFI-1562
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Extensions
>    Affects Versions: 0.5.0, 0.4.1
>            Reporter: Michal Klempa
>
> Argument splitting trims whitespace around the whole argument list and again around each individual argument.
> This causes wrong behavior when a DataFlow Manager needs to pass an empty string as an argument to a command run by ExecuteStreamCommand or ExecuteProcess.
> Let's start with what the DataFlow Manager needs to achieve (steps to reproduce):
> 1. Create a file "test.tsv" with *TAB* separated content:
> {code}
> one	two	three
> this	is	one	string
> {code}
> 2. Put a GetFile processor into the DataFlow to obtain this file.
> 3. Connect GetFile to ExecuteStreamCommand.
> 4. ExecuteStreamCommand configuration: 
>  - Command Path: cut
>  - Command Arguments: {code}-f;1,2,3,4;--output-delimiter;{code}
>  - auto terminate: original
> 5. Put a LogAttribute processor (Log Payload: true, autoterminate: success) and connect ExecuteStreamCommand to LogAttribute to see the output.
> 6. Run this Flow.
> Expected output:
> {code}
> onetwothree
> thisisonestring
> {code}
> Because the --output-delimiter argument to the cut command is an empty string (note the trailing semicolon in the argument list), cut effectively joins the columns.
> This output can be obtained by issuing this command from within bash:
> {code}
> $ cut -f 1,2,3,4 --output-delimiter '' test.tsv
> {code}
> The two apostrophes tell bash to pass an empty argument.
> Actual output:
> ExecuteStreamCommand reports a cut command error to the Bulletin:
> {code}
> 06:14:27 UTC
> ERROR
> fb12bb69-37e0-4e23-927c-a8aba40f360d
> ExecuteStreamCommand[id=fb12bb69-37e0-4e23-927c-a8aba40f360d] Transferring flow file StandardFlowFileRecord[uuid=d94c9e62-1005-4a2d-815d-bdb4c02ebd85,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1456380578601-1, container=default, section=1], offset=231, length=0],offset=0,name=test.tsv,size=0] to output stream. Executable command cut ended in an error: cut: option '--output-delimiter' requires an argument
> Try 'cut --help' for more information.
> {code}
> This is due to {{org.apache.nifi.processors.standard.util.ArgumentUtils}}:
> 1. Line 41: unwanted string trimming. Suppose we had used {{' '}} (a space) as the argument separator in the previous example; the property would then look like this: Command Arguments:
> {code}
> "-f 1,2,3,4 --output-delimiter "
> {code}
> (There is a space at the end of the string: the last separator, just as the trailing semicolon was.) The trimming on this line would then destroy our last argument before the argument string is even split into a list.
> 2. Line 52: if our output delimiter were {{" = "}} (space, equals, space), for example to create some kind of .ini file, this trimming would defeat the attempt by passing only {{"="}} to the cut command as the argument.
> 3. Line 53: if we want to pass an empty string to the cut command as an argument (to join columns), this line discards it.
> There is also a JUnit test, {{org.apache.nifi.processors.standard.TestExecuteProcess:testSplitArgs}}, which merely asserts this wrong behavior.
> 4. Lines 69, 71: trimming once again.
> While trying to fix this bug, I also see that there is an obscure QUOTE system. It is not only for quoting the delimiter character (which would otherwise be treated as a delimiter): QUOTES are also removed when they do not enclose the delimiter. This quoting should be rethought and documented. Let's at least fix the first bug reported here.
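
For illustration only, the behavior the report asks for (no trimming, empty arguments preserved) could be sketched roughly as below. This is not the code in ArgumentUtils or in the linked pull request; the class name ArgSplitSketch and its split method are hypothetical, quote handling is left out entirely, and treating an empty property as "no arguments" is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not NiFi's ArgumentUtils.
final class ArgSplitSketch {

    private ArgSplitSketch() {
    }

    /**
     * Splits input on delimiter without trimming anything.
     * Empty and whitespace-only arguments are kept exactly as written.
     * Quote handling is deliberately left out of this sketch.
     */
    static List<String> split(String input, char delimiter) {
        if (input == null || input.isEmpty()) {
            // Assumption: an unset or empty property means "no arguments".
            return new ArrayList<>();
        }
        // Pattern.quote makes the delimiter literal. The -1 limit keeps
        // trailing empty strings, which the default split would drop.
        String regex = Pattern.quote(String.valueOf(delimiter));
        return new ArrayList<>(Arrays.asList(input.split(regex, -1)));
    }

    public static void main(String[] args) {
        // Trailing semicolon: the last argument is the empty string.
        List<String> joined = split("-f;1,2,3,4;--output-delimiter;", ';');
        System.out.println(joined.size()); // prints 4

        // Whitespace inside an argument survives untouched.
        List<String> spaced = split("-f;1,2;--output-delimiter; = ", ';');
        System.out.println("[" + spaced.get(3) + "]"); // prints [ = ]
    }
}
```

A list built this way could then be appended to the command path and handed to java.lang.ProcessBuilder, which accepts the command as a list of strings and involves no shell, so no apostrophes are needed to express the empty argument.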



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
