hadoop-mapreduce-issues mailing list archives

From "Aleksandr Balitsky (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (MAPREDUCE-6778) Provide way to limit MRJob's stdout/stderr size
Date Thu, 15 Sep 2016 14:04:20 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15493396#comment-15493396 ]

Aleksandr Balitsky edited comment on MAPREDUCE-6778 at 9/15/16 2:03 PM:
------------------------------------------------------------------------

[~jira.shegalov], thanks for the review.
- Regarding circular dependencies: this should not happen, because the stdOutLogger logger uses
an appender that directs logs to the stdout file, not to the stdout stream.
- The stdout/stderr size limit is optional, so if users override System.out, we can't enforce
the file size limit, as far as I know.
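
As a rough illustration of the first point (not the code from the patch), the sketch below redirects System.out to a sink that writes only to a file. It uses plain java.io rather than log4j, and the class name StdoutToFileSketch and the method redirectStdoutTo are hypothetical. Because the sink never writes back to System.out, a redirected print cannot loop back into itself.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch; plain java.io stands in for the log4j logger and appender. */
public class StdoutToFileSketch {

    /** Points System.out at the given file and returns the previous stream. */
    static PrintStream redirectStdoutTo(Path file) throws IOException {
        PrintStream original = System.out;
        // The sink writes to the file only. It never touches System.out,
        // so redirected output cannot re-enter the redirection.
        OutputStream fileSink = Files.newOutputStream(
                file, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        System.setOut(new PrintStream(fileSink, true, "UTF-8"));
        return original;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("stdout", ".log");
        PrintStream original = redirectStdoutTo(file);

        System.out.println("hello from the task"); // goes to the file
        System.out.close();
        System.setOut(original);                   // restore the console stream

        String captured = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        System.out.print("captured: " + captured);
    }
}
```

A cycle would only appear if the sink itself printed to System.out after the redirect, which is the case the stdOutLogger appender avoids by targeting the file.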

Logs can grow to tens or even hundreds of GB, so they have to be restricted on the fly, and we
need to see output during the container's execution.

Thank you for your suggestion.


was (Author: abalitsky1):
[~jira.shegalov], thanks for the review.
- Regarding circular dependencies: this should not happen, because my logger uses an appender
that directs logs to the stdout file, not to the stdout stream.
- The stdout/stderr size limit is optional, so if users override System.out, we can't enforce
the file size limit, as far as I know.

Logs can grow to tens or even hundreds of GB, so they have to be restricted on the fly, and we
need to see output during the container's execution.

Thank you for your suggestion.

> Provide way to limit MRJob's stdout/stderr size
> -----------------------------------------------
>
>                 Key: MAPREDUCE-6778
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6778
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: nodemanager
>    Affects Versions: 2.7.0
>            Reporter: Aleksandr Balitsky
>            Priority: Minor
>         Attachments: MAPREDUCE-6778.v1.001.patch
>
>
> A job can write a huge amount of stdout/stderr output, which has undesired consequences.
> A possible solution is to redirect stdout and stderr output to log4j in the YarnChild.java
> main method.
> In this case, the System.out and System.err streams are redirected to a log4j logger with
> an appender that directs output into the stderr or stdout files with the required size limit.
> This lets us limit log size on the fly, keeping one rolling backup file (thanks to
> ContainerRollingLogAppender).
> One of the syslog size-limiting approaches works the same way.
> The limits can then be set via new properties in mapred-site.xml:
> mapreduce.task.userlog.stderr.limit.kb
> mapreduce.task.userlog.stdout.limit.kb
> Advantages of this solution:
> - It allows us to restrict file sizes during job execution.
> - We can see logs during job execution.
> Disadvantages:
> - It works only for MR jobs.
> Is this an appropriate solution to the problem, or is there something better?
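
The idea in the description above (a size limit enforced on the fly, with one rolling backup file) can be sketched roughly as follows. This is a minimal, self-contained illustration and not the attached patch: it uses plain java.io in place of log4j and ContainerRollingLogAppender, the class RollingFileStream is hypothetical, and the 1 KB limit stands in for a value that would come from a property such as mapreduce.task.userlog.stdout.limit.kb.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

/** Hypothetical stand-in for a size-limited appender; not part of Hadoop or log4j. */
public class RollingFileStream extends OutputStream {
    private final Path file;
    private final Path backup;      // single backup file: "<file>.1"
    private final long limitBytes;
    private OutputStream out;
    private long written;

    public RollingFileStream(Path file, long limitKb) throws IOException {
        this.file = file;
        this.backup = Paths.get(file.toString() + ".1");
        this.limitBytes = limitKb * 1024;
        this.written = Files.exists(file) ? Files.size(file) : 0;
        this.out = open();
    }

    private OutputStream open() throws IOException {
        return Files.newOutputStream(file, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    @Override
    public synchronized void write(int b) throws IOException {
        if (written >= limitBytes) {
            roll(); // limit reached: keep the current file as the only backup
        }
        out.write(b);
        written++;
    }

    /** Moves the current file over the backup and starts a fresh, empty file. */
    private void roll() throws IOException {
        out.close();
        Files.move(file, backup, StandardCopyOption.REPLACE_EXISTING);
        out = open();
        written = 0;
    }

    @Override
    public synchronized void flush() throws IOException {
        out.flush();
    }

    @Override
    public synchronized void close() throws IOException {
        out.close();
    }

    public static void main(String[] args) throws IOException {
        Path stdoutFile = Files.createTempDirectory("userlogs").resolve("stdout");
        PrintStream original = System.out;

        // Redirect System.out into the size-limited file (assumed limit: 1 KB).
        System.setOut(new PrintStream(new RollingFileStream(stdoutFile, 1), true));
        for (int i = 0; i < 200; i++) {
            System.out.println("line " + i + " of task output");
        }
        System.out.close();
        System.setOut(original);

        System.out.println("stdout bytes: " + Files.size(stdoutFile));
        System.out.println("backup exists: " + Files.exists(Paths.get(stdoutFile + ".1")));
    }
}
```

With this scheme the disk usage per stream stays within roughly twice the configured limit (current file plus one backup), and the most recent output remains readable while the task is still running. Writing byte by byte keeps the sketch short; a real appender would buffer.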




