hadoop-mapreduce-issues mailing list archives

From "Vinod K V (JIRA)" <j...@apache.org>
Subject [jira] Updated: (MAPREDUCE-1100) User's task-logs filling up local disks on the TaskTrackers
Date Mon, 02 Nov 2009 09:11:59 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-1100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated MAPREDUCE-1100:

    Attachment: MAPREDUCE-1100-20091102.txt

Attaching a first patch.

Introducing the following configuration items:
 - Job configuration:
    -- {{JobContext.MAP_USERLOG_LIMIT}} : Per-task limit on how large each log file can grow. Used by {{killRunningTasksOverLimit()}} to kill tasks that log excessively.
    -- {{JobContext.REDUCE_USERLOG_LIMIT}} : Same as above, for reduces.
    -- {{JobContext.MAP_USERLOG_RETAIN_SIZE}} : Per-task setting for how much of the tail of each log file is retained. Each task-log file is truncated to this size after the task finishes. Used by {{truncateLogsOfFinishedTasks()}}.
    -- {{JobContext.REDUCE_USERLOG_RETAIN_SIZE}} : Same as above, for reduces.

 - TT configuration:
    -- {{TTConfig.TT_USERLOG_RETAIN_HOURS}} : TT setting for how long the logs of each finished task are retained on this TT. Used by {{retireOldLogs()}} to clean up very old logs.
    -- {{TTConfig.TT_USERLOG_CUMULATIVE_LIMIT}} : TT setting limiting the total disk usage of log files across all tasks. If the total usage grows beyond this limit, {{removeOldFilesToControlCumulativeUsage()}} removes old log files regardless of their age relative to {{TTConfig.TT_USERLOG_RETAIN_HOURS}}.
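To illustrate how the two TT settings could interact, here is a minimal sketch of the selection logic, not the code in the attached patch. The class {{LogRetentionPlanner}}, the {{TaskLog}} holder and the method {{selectForRemoval}} are hypothetical names, and the sketch assumes the cumulative limit is expressed in bytes and that "old" is measured from each task's finish time.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; the names below are illustrative and not taken from the patch.
public class LogRetentionPlanner {

    /** Minimal description of one finished task's log directory. */
    public static final class TaskLog {
        final String taskId;
        final long finishedAtMillis;
        final long sizeBytes;

        public TaskLog(String taskId, long finishedAtMillis, long sizeBytes) {
            this.taskId = taskId;
            this.finishedAtMillis = finishedAtMillis;
            this.sizeBytes = sizeBytes;
        }
    }

    /**
     * Returns the task IDs whose logs should be removed:
     * first every log older than retainHours, then (oldest first) as many of
     * the remaining logs as needed to get under cumulativeLimitBytes.
     */
    public static List<String> selectForRemoval(List<TaskLog> logs, long nowMillis,
                                                long retainHours, long cumulativeLimitBytes) {
        long cutoff = nowMillis - TimeUnit.HOURS.toMillis(retainHours);
        List<String> toRemove = new ArrayList<>();
        List<TaskLog> kept = new ArrayList<>();
        long total = 0;

        // Pass 1: age-based retirement (the retain-hours setting).
        for (TaskLog log : logs) {
            if (log.finishedAtMillis < cutoff) {
                toRemove.add(log.taskId);
            } else {
                kept.add(log);
                total += log.sizeBytes;
            }
        }

        // Pass 2: cumulative limit, applied regardless of age, oldest logs first.
        kept.sort(Comparator.comparingLong((TaskLog l) -> l.finishedAtMillis));
        for (TaskLog log : kept) {
            if (total <= cumulativeLimitBytes) {
                break;
            }
            toRemove.add(log.taskId);
            total -= log.sizeBytes;
        }
        return toRemove;
    }
}
```

Keeping the selection separate from the actual file deletion makes the policy easy to test without touching the disk; a real monitor would still have to delete the selected directories and cope with files that disappear concurrently.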

Moved clean-up of task-logs from the child into TaskLogsMonitor, which does the following:
while(true) {

  retireOldLogs(); // remove very old logs

  truncateLogsOfFinishedTasks(); // truncate finished tasks' logs. Also set non-writable permissions.

  killRunningTasksOverLimit(); // kill tasks going over the per-task, per-file limit

  removeOldFilesToControlCumulativeUsage(); // remove old logs if total usage is alarming, irrespective of retain.hours

}
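As an illustration of the truncation step in the loop above, the following is a minimal sketch of keeping only the tail of a finished task's log file. It is not the patch's implementation: {{LogTailTruncator}} and {{truncateToTail}} are hypothetical names, the retain size is assumed to be in bytes, and the tail is buffered in memory, which is only reasonable for modest retain sizes. The cut is made at a byte offset, so the first retained line may be partial.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;

// Hypothetical helper; names are illustrative and not taken from the patch.
public class LogTailTruncator {

    /**
     * Rewrites the file in place so that only its last retainBytes bytes remain.
     * Returns the number of bytes dropped (0 if the file was already small enough).
     */
    public static long truncateToTail(Path file, long retainBytes) throws IOException {
        if (retainBytes < 0) {
            throw new IllegalArgumentException("retainBytes must be >= 0");
        }
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            long size = raf.length();
            if (size <= retainBytes) {
                return 0; // nothing to truncate
            }
            // Read the tail that should survive (assumes it fits in memory).
            byte[] tail = new byte[Math.toIntExact(retainBytes)];
            raf.seek(size - retainBytes);
            raf.readFully(tail);

            // Move the tail to the start of the file and cut off the rest.
            raf.seek(0);
            raf.write(tail);
            raf.setLength(retainBytes);
            return size - retainBytes;
        }
    }
}
```

Truncating in place is only safe once the task has finished and nothing is still appending to the file, which matches the description of this step being applied to finished tasks. Setting the non-writable permissions mentioned in the loop comment would be a separate step after the truncation.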

> User's task-logs filling up local disks on the TaskTrackers
> -----------------------------------------------------------
>                 Key: MAPREDUCE-1100
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1100
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>    Affects Versions: 0.21.0
>            Reporter: Vinod K V
>            Assignee: Vinod K V
>         Attachments: MAPREDUCE-1100-20091102.txt
> Some users' jobs are filling up TT disks by outrageous logging. mapreduce.task.userlog.limit.kb is not enabled on the cluster. Disks are getting filled up before task-log cleanup via mapred.task.userlog.retain.hours can kick in.

