hadoop-mapreduce-issues mailing list archives

From "Vinod K V (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1648) Use RollingFileAppender to limit tasklogs
Date Sun, 11 Apr 2010 11:28:42 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855707#action_12855707 ]

Vinod K V commented on MAPREDUCE-1648:

Sorry for overlooking this issue so far.

bq. When reviewing the patch, please test the performance and make sure we don't re-introduce
the slowness observed at HADOOP-1553.
This is a must.

Even before that, I don't think this approach will work at all once we also consider streaming
and pipes jobs. So far, we have been capturing the stderr of streaming tasks, and the stdout and
stderr of pipes jobs, into the stdout/stderr files in userlogs/$attemptid/. From a first look
at the patch, it seems to me that this behaviour is completely gone and the JVM's output and
error streams are now redirected to /dev/null. The subsequently spawned streaming or pipes
processes are expected to keep the same capture behaviour, but this patch seems to throw it away.

Even if we fix the above, I am not really sure this will deliver the overall feature of limiting
logs, simply because switching/rotating log files in the JVM process will not do the same for
the streaming/pipes tasks. I think the approach outlined at MAPREDUCE-1100, truncating logs
after the task finishes, is the solution given all these limitations. Thoughts?
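
To make the post-task truncation idea concrete, here is a minimal sketch that keeps only the last N bytes of a log file once the task is done. It is not the MAPREDUCE-1100 implementation; the class name TaskLogTruncator, the method truncateToTail, and the byte limit are all hypothetical. The point it illustrates is that the file is trimmed the same way regardless of which process (the JVM, a streaming child, or a pipes child) wrote it.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, for illustration only; not taken from any Hadoop patch.
final class TaskLogTruncator {

    /**
     * Keeps only the last maxBytes bytes of the file and returns the new length.
     * The retained tail is held in memory, so maxBytes must fit in an int.
     */
    static long truncateToTail(Path file, long maxBytes) throws IOException {
        if (maxBytes < 0) {
            throw new IllegalArgumentException("maxBytes must be >= 0");
        }
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            long length = raf.length();
            if (length <= maxBytes) {
                return length;              // already within the limit
            }
            byte[] tail = new byte[Math.toIntExact(maxBytes)];
            raf.seek(length - maxBytes);    // jump to the start of the tail
            raf.readFully(tail);
            raf.seek(0);
            raf.write(tail);                // move the tail to the front
            raf.setLength(maxBytes);        // drop everything after it
            return maxBytes;
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a stderr file written by a streaming or pipes child.
        Path log = Files.createTempFile("stderr", ".log");
        Files.write(log, "line1\nline2\nline3\n".getBytes(StandardCharsets.UTF_8));
        truncateToTail(log, 6);
        System.out.print(new String(Files.readAllBytes(log), StandardCharsets.UTF_8));
        Files.delete(log);
    }
}
```

A sketch like this only bounds the size after the task has finished; it does nothing to limit disk usage while the task is still running.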

> Use RollingFileAppender to limit tasklogs
> -----------------------------------------
>                 Key: MAPREDUCE-1648
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1648
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: tasktracker
>            Reporter: Guilin Sun
>            Priority: Minor
>         Attachments: syslog.patch
> There are at least two types of task logs: syslog and stdlog.
> The task JVM writes syslog through log4j with TaskLogAppender. TaskLogAppender behaves much
> like "tail -c": it stores the last N bytes/lines of log output in memory (via a queue) and does
> the real output only once all logs are committed and the appender is about to close.
> The common problem of TaskLogAppender and 'tail -c' is that they keep everything in memory
> and the user can't see any log output while the task is in progress.
> So I'm going to try RollingFileAppender instead of TaskLogAppender, using MaxFileSize and
> MaxBackupIndex to limit the log file size.
> RollingFileAppender is also suitable for stdout/stderr: just redirect stdout/stderr to
> log4j via LoggingOutputStream. No client code has to be changed, and RollingFileAppender
> seems better than 'tail -c' too.
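
The "tail -c"-like behaviour described in the issue can be sketched as a bounded in-memory queue that writes nothing until it is closed. This is an illustration under assumptions, not the actual TaskLogAppender code: the class name TailBuffer and its methods are hypothetical, and it counts lines rather than bytes for brevity.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a tail-style appender; not Hadoop's TaskLogAppender.
final class TailBuffer implements Closeable {
    private final Deque<String> lines = new ArrayDeque<>();
    private final int maxLines;
    private final Writer sink;

    TailBuffer(int maxLines, Writer sink) {
        if (maxLines < 1) {
            throw new IllegalArgumentException("maxLines must be >= 1");
        }
        this.maxLines = maxLines;
        this.sink = sink;
    }

    /** Buffers a line in memory, evicting the oldest one when the queue is full. */
    void append(String line) {
        if (lines.size() == maxLines) {
            lines.removeFirst();
        }
        lines.addLast(line);
    }

    /** Only here does any output reach the sink. */
    @Override
    public void close() throws IOException {
        for (String line : lines) {
            sink.write(line);
            sink.write('\n');
        }
        sink.flush();
        lines.clear();
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        TailBuffer buffer = new TailBuffer(2, out);
        buffer.append("a");
        buffer.append("b");
        buffer.append("c");
        // Nothing is visible while the "task" is still running.
        System.out.println("before close: [" + out + "]");
        buffer.close();
        System.out.print(out);
    }
}
```

The sketch shows both drawbacks named in the issue: the retained tail lives in memory, and the sink stays empty until close() is called.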


