hadoop-mapreduce-issues mailing list archives

From "Vinod K V (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1100) User's task-logs filling up local disks on the TaskTrackers
Date Tue, 13 Oct 2009 08:23:31 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764999#action_12764999 ]

Vinod K V commented on MAPREDUCE-1100:

The framework currently offers only two features that can be used to limit user logs:
 - a user-log size limit via mapreduce.task.userlog.limit.kb
 - log cleanup via mapred.task.userlog.retain.hours

mapreduce.task.userlog.limit.kb is not usable in its current form because of the following limitations:
 - If it is used, the userlogs cannot be shown until the task finishes or fails. This is not acceptable.
 - The stdout/stderr files are limited by running 'tail -c' on the stdout/stderr of the task JVM. This tail command uses some of the precious memory allocated to the users, and that memory is not accounted for or controlled anywhere.
 - syslog files are written by the tasks, but the files themselves can be written to arbitrarily by the JVM and its child processes without respecting any of these limits.
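
The first two limitations follow from how a tail-style limit behaves. Below is a minimal Python sketch, assuming the limit means "keep only the last N bytes of the stream"; the function name tail_bytes and the chunk_size parameter are illustrative and are not Hadoop code.

```python
import io
from collections import deque


def tail_bytes(stream, limit_bytes, chunk_size=4096):
    """Return the last `limit_bytes` bytes of `stream`, similar to `tail -c`."""
    kept = deque()   # chunks that may still belong to the final tail
    kept_len = 0     # total number of bytes currently buffered
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break    # end of stream: only now is the tail known
        kept.append(chunk)
        kept_len += len(chunk)
        # Drop leading chunks that can no longer be part of the tail.
        while kept and kept_len - len(kept[0]) >= limit_bytes:
            kept_len -= len(kept.popleft())
    data = b"".join(kept)
    return data[-limit_bytes:] if limit_bytes > 0 else b""


# Hypothetical usage: keep the last 4 bytes of a 100-byte stream.
last = tail_bytes(io.BytesIO(b"0123456789" * 10), 4, chunk_size=3)
```

Nothing can be returned before the stream ends, which matches the observation that logs are not viewable while the task is running, and the buffer of retained bytes has to live in some process's memory.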

mapred.task.userlog.retain.hours cannot completely solve the issue because
 - it only takes into account how long the logs have to be retained *and not* the disk usage
 - because of MAPREDUCE-927, the cleanup itself is not guaranteed to happen even on time.
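
A rough back-of-the-envelope sketch of why a purely time-based retention setting does not bound disk usage. The function names and all numbers here are hypothetical and assume a steady logging rate; they are not measurements from any cluster.

```python
def hours_until_full(free_disk_gb, log_rate_mb_per_s):
    """Hours for logs written at a steady rate to fill the free disk space."""
    seconds = (free_disk_gb * 1024) / log_rate_mb_per_s
    return seconds / 3600


def cleanup_helps(free_disk_gb, log_rate_mb_per_s, retain_hours):
    """True if the first time-based cleanup is due before the disk is full."""
    return retain_hours < hours_until_full(free_disk_gb, log_rate_mb_per_s)


# Hypothetical example: 100 GB free, tasks logging 10 MB/s in total,
# and logs retained for 24 hours.
fill_hours = hours_until_full(100, 10)
helps = cleanup_helps(100, 10, 24)
```

With these assumed numbers the disk fills in under three hours, long before a 24-hour retention period expires, so the cleanup never gets a chance to free space.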

We should have a concrete mechanism to limit the amount of disk space used by user logs.
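
As an illustration only, one possible shape for such a mechanism is a cap on the total size of retained logs, with the oldest logs removed first. This is a sketch under that assumption and not the design proposed in this issue; select_for_deletion and its tuple layout are hypothetical.

```python
def select_for_deletion(log_dirs, max_total_bytes):
    """Pick log directories to delete, oldest first, until the cap is met.

    log_dirs: list of (name, size_bytes, mtime) tuples.
    Returns the names to delete, in deletion order.
    """
    total = sum(size for _, size, _ in log_dirs)
    to_delete = []
    # Walk the directories from oldest to newest modification time.
    for name, size, _ in sorted(log_dirs, key=lambda d: d[2]):
        if total <= max_total_bytes:
            break  # already under the cap, keep everything else
        to_delete.append(name)
        total -= size
    return to_delete


# Hypothetical usage: three log directories totalling 120 bytes, cap of 80.
dirs = [("attempt_a", 50, 1), ("attempt_b", 30, 2), ("attempt_c", 40, 3)]
victims = select_for_deletion(dirs, 80)
```

Unlike a retention period, a cap of this kind bounds disk usage directly, although it may delete logs that users still want to read.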

> User's task-logs filling up local disks on the TaskTrackers
> -----------------------------------------------------------
>                 Key: MAPREDUCE-1100
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1100
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>    Affects Versions: 0.21.0
>            Reporter: Vinod K V
> Some users' jobs are filling up TT disks by outrageous logging. mapreduce.task.userlog.limit.kb
> is not enabled on the cluster. Disks are getting filled up before task-log cleanup via mapred.task.userlog.retain.hours
> can kick in.

