hadoop-mapreduce-issues mailing list archives

From "Guilin Sun (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1648) Use RollingFileAppender to limit tasklogs
Date Sun, 11 Apr 2010 13:16:41 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855720#action_12855720 ]

Guilin Sun commented on MAPREDUCE-1648:
---------------------------------------

Thanks for the comment!

bq. From the first look of the patch, it seems to me that this behaviour is completely gone
and the jvm's output and error are now being redirected to /dev/null

The default stdout/stderr are replaced at line 13: we use Log4jOutputStream instead of the
default streams, so "/dev/null" receives nothing unless log4j initialization fails. I put
"/dev/null" there to prevent the child JVM from writing anything to the real stdout/stderr.
(log4j/contrib/ provides a LoggingOutputStream implementation that does the same thing.)
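
The redirection idea can be sketched in plain Java as below. This is only an illustration, not the patch's Log4jOutputStream: the class name LineForwardingOutputStream, the capture helper and the Consumer<String> sink are hypothetical stand-ins (in the patch the sink would be a log4j logger), and the sketch assumes the default platform charset.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch: an OutputStream that forwards each completed line to a sink. */
final class LineForwardingOutputStream extends OutputStream {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final Consumer<String> sink; // assumption: stands in for a log4j logger call

    LineForwardingOutputStream(Consumer<String> sink) {
        this.sink = sink;
    }

    @Override
    public synchronized void write(int b) {
        if (b == '\n') {
            emit();               // a full line is complete: hand it over immediately
        } else if (b != '\r') {
            buffer.write(b);      // collect bytes of the current line
        }
    }

    @Override
    public synchronized void close() {
        // Forward a trailing partial line so it is not lost at shutdown.
        if (buffer.size() > 0) {
            emit();
        }
    }

    private void emit() {
        sink.accept(new String(buffer.toByteArray(), Charset.defaultCharset()));
        buffer.reset();
    }

    /** Runs body with System.out redirected and returns the lines it printed. */
    static List<String> capture(Runnable body) {
        List<String> lines = new ArrayList<>();
        PrintStream original = System.out;
        PrintStream redirected = new PrintStream(new LineForwardingOutputStream(lines::add));
        System.setOut(redirected);
        try {
            body.run();
        } finally {
            redirected.close();      // emits any unterminated last line
            System.setOut(original); // restore the real stdout
        }
        return lines;
    }
}
```

Code that prints through System.out needs no change; it simply ends up writing to the replacement stream, which is the property the patch relies on for streaming tasks.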

bq. Even if we fix the above, I am not really sure this will solve the overall feature of
limiting logs simply because simply switching/rotating log files in the jvm process will not
do the same for the streaming/pipes tasks

Streaming uses PipeMapper/PipeReducer; their stderr is caught by the child JVM and then
written to the child JVM's stderr. Because we replaced the child JVM's default stdout and
stderr, this works well with streaming. Pipes has not been tested yet.

This issue addresses the problem with 'tail -c' and the old TaskLogAppender (in fact another
version of 'tail -c'). The main benefits of this patch are:
# No delay, so logs are not lost when the child exits abnormally.
# Tasks are prevented from producing overly large logs as they run, rather than having their
logs truncated only after the task finishes.
# Because of log4j, we can change the log directory/file when starting a new task (for JVM
reuse), so it is easy to control log size per task.
# stdout/stderr are redirected by the child JVM, so no client source code needs to change
(including streaming).
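
As a rough illustration of how a size limit plus a backup count bounds disk usage while the task is still running, here is a minimal plain-Java sketch. It is loosely modelled on the rollover behaviour of log4j's RollingFileAppender but is not that class: BoundedRollingLog and its methods are hypothetical names, and the maxFileSize/maxBackupIndex parameters merely mirror the MaxFileSize and MaxBackupIndex options named in the issue.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch of size-bounded log rolling; not log4j's implementation. */
final class BoundedRollingLog {
    private final Path file;
    private final long maxFileSize;   // roll once the active file reaches this many bytes
    private final int maxBackupIndex; // number of rolled files to keep (file.1 .. file.N)

    BoundedRollingLog(Path file, long maxFileSize, int maxBackupIndex) {
        this.file = file;
        this.maxFileSize = maxFileSize;
        this.maxBackupIndex = maxBackupIndex;
    }

    /** Appends one line straight to disk, then rolls if the size limit is reached. */
    synchronized void append(String line) throws IOException {
        byte[] bytes = (line + "\n").getBytes(StandardCharsets.UTF_8);
        Files.write(file, bytes, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        if (Files.size(file) >= maxFileSize) {
            rollOver();
        }
    }

    private void rollOver() throws IOException {
        if (maxBackupIndex <= 0) {
            Files.deleteIfExists(file); // no backups kept: just start over
            return;
        }
        // Drop the oldest backup, shift the rest up by one, then move the active file to ".1".
        Files.deleteIfExists(backup(maxBackupIndex));
        for (int i = maxBackupIndex - 1; i >= 1; i--) {
            Path from = backup(i);
            if (Files.exists(from)) {
                Files.move(from, backup(i + 1), StandardCopyOption.REPLACE_EXISTING);
            }
        }
        Files.move(file, backup(1), StandardCopyOption.REPLACE_EXISTING);
    }

    private Path backup(int index) {
        return file.resolveSibling(file.getFileName() + "." + index);
    }
}
```

In this sketch every line reaches disk as soon as it is appended, and total usage stays near (maxBackupIndex + 1) * maxFileSize, with each file able to overshoot by at most one line.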





> Use RollingFileAppender to limit tasklogs
> -----------------------------------------
>
>                 Key: MAPREDUCE-1648
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1648
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: tasktracker
>            Reporter: Guilin Sun
>            Priority: Minor
>         Attachments: syslog.patch
>
>
> There are at least two types of task logs: syslog and stdlog.
> The task JVM writes syslog through log4j with TaskLogAppender. TaskLogAppender behaves just
> like "tail -c": it keeps the last N bytes/lines of log in memory (in a queue) and writes them
> out only when all logs are committed and the appender is about to close.
> The common problem of TaskLogAppender and 'tail -c' is that they keep everything in memory
> and the user can't see any log output while the task is in progress.
> So I'm going to try RollingFileAppender instead of TaskLogAppender, using MaxFileSize and
> MaxBackupIndex to limit log file size.
> RollingFileAppender is also suitable for stdout/stderr: just redirect stdout/stderr to
> log4j via LoggingOutputStream. No client code has to change, and RollingFileAppender
> seems better than 'tail -c' too.

