nifi-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (NIFI-1856) ExecuteStreamCommand Needs to Consume Standard Error
Date Tue, 14 Feb 2017 22:23:42 GMT

    [ https://issues.apache.org/jira/browse/NIFI-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866826#comment-15866826 ]

ASF GitHub Bot commented on NIFI-1856:
--------------------------------------

Github user brosander commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/1364#discussion_r101158900
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ExecuteStreamCommand.java ---
    @@ -287,6 +311,32 @@ protected PropertyDescriptor getSupportedDynamicPropertyDescriptor(final String
             .build();
         }
     
    +    @OnScheduled
    +    public void setupExecutor(final ProcessContext context) {
    +        executor = Executors.newFixedThreadPool(context.getMaxConcurrentTasks() * 2, new ThreadFactory() {
    +            private final ThreadFactory defaultFactory = Executors.defaultThreadFactory();
    +
    +            @Override
    +            public Thread newThread(final Runnable r) {
    +                final Thread t = defaultFactory.newThread(r);
    +                t.setName("ExecuteStreamCommand " + getIdentifier() + " Task");
    +                return t;
    +            }
    +        });
    +    }
    +
    +    @OnUnscheduled
    --- End diff --
    
    @rkarthik29 
    
    OnUnscheduled runs before all of the executing threads have completed, so I think it would be
    safer to shut down the executor in OnStopped.


> ExecuteStreamCommand Needs to Consume Standard Error
> ----------------------------------------------------
>
>                 Key: NIFI-1856
>                 URL: https://issues.apache.org/jira/browse/NIFI-1856
>             Project: Apache NiFi
>          Issue Type: Bug
>            Reporter: Alan Jackoway
>            Assignee: Karthik Narayanan
>
> I was using ExecuteStreamCommand to run certain hdfs commands that are tricky to write in NiFi but easy in bash (e.g. {{hadoop fs -rm -r /data/*/2014/05/05}}).
> However, my larger commands kept hanging, even though they finish quickly when I run them from the command line.
> Based on http://www.javaworld.com/article/2071275/core-java/when-runtime-exec---won-t.html I believe that ExecuteStreamCommand and possibly other processors need to consume the standard error stream to prevent the processes from blocking when the standard error buffer fills up.
> To reproduce, create this as ~/write.py:
> {code:python}
> import sys
> count = int(sys.argv[1])
> for x in range(count):
>     sys.stderr.write("ERROR %d\n" % x)
>     sys.stdout.write("OUTPUT %d\n" % x)
> {code}
> Create a flow that goes:
> # GenerateFlowFile - 5 minutes schedule, 0 bytes size
> # ExecuteStreamCommand - Command Arguments /Users/alanj/write.py;100, Command Path python
> # PutFile - /tmp/write/
> routing the output stream of ExecuteStreamCommand to PutFile.
> When you turn everything on, you get 100 lines (not 200) of just the standard output in /tmp/write.
> Next, change the command arguments to /Users/alanj/write.py;100000 and turn everything on again. The command will hang.
> I believe that whenever you execute a process the way ExecuteStreamCommand does, you need to consume the standard error stream to keep it from blocking. This may also affect things like ExecuteProcess and ExecuteScript.
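
The blocking behaviour described in the report is not specific to NiFi: a child process stalls once it has written more to standard error than the OS pipe buffer holds and nothing is reading the other end. As a rough illustration only (this is Python, not the Java change under review in the pull request), the sketch below runs an inline copy of write.py and drains stderr on a background thread while the main thread reads stdout. The helper name run_and_drain and the CHILD_SOURCE constant are made up for this example.

```python
import subprocess
import sys
import threading

# Inline stand-in for the write.py script from the report, passed via "-c".
# With "-c", sys.argv[1] in the child is still the first extra argument.
CHILD_SOURCE = (
    "import sys\n"
    "count = int(sys.argv[1])\n"
    "for x in range(count):\n"
    "    sys.stderr.write('ERROR %d\\n' % x)\n"
    "    sys.stdout.write('OUTPUT %d\\n' % x)\n"
)


def run_and_drain(count):
    """Hypothetical helper: run the child and consume stdout and stderr concurrently."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE, str(count)],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    stderr_chunks = []

    def drain_stderr():
        # Read until EOF so the child never blocks on a full stderr pipe.
        for line in proc.stderr:
            stderr_chunks.append(line)

    drainer = threading.Thread(target=drain_stderr, daemon=True)
    drainer.start()

    # The main thread consumes stdout while the drainer handles stderr.
    stdout_data = proc.stdout.read()
    drainer.join()
    exit_code = proc.wait()
    return exit_code, stdout_data, b"".join(stderr_chunks)


if __name__ == "__main__":
    code, out, err = run_and_drain(100000)
    print(code, len(out.splitlines()), len(err.splitlines()))
```

With both streams consumed, the 100000-line case should run to completion instead of hanging. In Python specifically, Popen.communicate() also reads both pipes concurrently, so the explicit thread is there only to make the mechanism visible.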



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
