hadoop-yarn-issues mailing list archives

From "Aleksandr Balitsky (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-5619) Provide way to limit MRJob's stdout/stderr size
Date Tue, 06 Sep 2016 13:44:21 GMT
Aleksandr Balitsky created YARN-5619:

             Summary: Provide way to limit MRJob's stdout/stderr size
                 Key: YARN-5619
                 URL: https://issues.apache.org/jira/browse/YARN-5619
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: log-aggregation, nodemanager
    Affects Versions: 2.7.0
            Reporter: Aleksandr Balitsky
            Priority: Minor

A job can be run with a huge amount of stdout/stderr output, causing undesired consequences. There is already a JIRA that has been open for a while on this issue.

A possible solution is to redirect stdout and stderr to log4j in YarnChild.java's main method:

System.setErr(new PrintStream(new LoggingOutputStream(<Log4j logger>, Level.ERROR), true));
System.setOut(new PrintStream(new LoggingOutputStream(<Log4j logger>, Level.INFO), true));

In this case System.out and System.err will be redirected to a log4j logger with an appropriate appender that directs output to the stdout or stderr files with the needed size limitation.
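The redirection can be sketched as follows. Since a `LoggingOutputStream` is not part of log4j's core API, a minimal, hypothetical implementation is shown here; it uses `java.util.logging` from the JDK in place of log4j so the sketch is self-contained, but the log4j variant would be analogous (same redirect logic, log4j `Logger`/`Level` types instead).

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;
import java.util.logging.Level;
import java.util.logging.Logger;

// Minimal sketch of a LoggingOutputStream: buffers bytes and emits one
// log record per line. Shown with java.util.logging so the example runs
// without log4j on the classpath; the names mirror the snippet above.
class LoggingOutputStream extends OutputStream {
    private final Logger logger;
    private final Level level;
    private final StringBuilder buffer = new StringBuilder();

    LoggingOutputStream(Logger logger, Level level) {
        this.logger = logger;
        this.level = level;
    }

    @Override
    public void write(int b) throws IOException {
        char c = (char) b;
        if (c == '\n') {
            flushLine();            // complete line -> one log record
        } else if (c != '\r') {
            buffer.append(c);       // ignore CR so CRLF behaves like LF
        }
    }

    private void flushLine() {
        if (buffer.length() > 0) {
            logger.log(level, buffer.toString());
            buffer.setLength(0);
        }
    }
}

public class RedirectDemo {
    public static void main(String[] args) {
        Logger logger = Logger.getLogger("YarnChild");
        // Redirect stdout/stderr into the logger, as proposed for
        // YarnChild.main; autoFlush=true so println triggers the flush.
        System.setOut(new PrintStream(new LoggingOutputStream(logger, Level.INFO), true));
        System.setErr(new PrintStream(new LoggingOutputStream(logger, Level.SEVERE), true));
        System.out.println("routed through the INFO-level logger");
        System.err.println("routed through the ERROR-level logger");
    }
}
```

With this in place, a size-capped appender (e.g. a rolling-file appender) attached to the logger would enforce the limit on the stdout/stderr files during job execution.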

Advantages of this solution:
- It allows us to restrict file sizes during job execution.
- It affects only MapReduce jobs.

One drawback: logs are buffered in memory and flushed to disk only after the job finishes (syslog works the same way), so we can lose logs if the container is killed or fails.

Is this an appropriate solution for this problem, or is there something better?

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org
