hadoop-common-dev mailing list archives

From "Vivek Ratan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4523) Enhance how memory-intensive user tasks are handled
Date Mon, 27 Oct 2008 07:56:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12642871#action_12642871 ]

Vivek Ratan commented on HADOOP-4523:
-------------------------------------

HADOOP-3759 provides a configuration value, _mapred.tasktracker.tasks.maxmemory_, which specifies
the total virtual memory (VM) on a machine available to tasks spawned by the TT. Along with HADOOP-4439,
it provides a cluster-wide default for the maximum VM allowed per task, _mapred.task.default.maxmemory_.
This value can be overridden by individual jobs (a sketch of such an override follows). HADOOP-3581
implements a monitoring mechanism that kills tasks if they go over their _maxmemory_ value.
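
For concreteness, a job-level override might look like the minimal sketch below. This assumes the
property names above are set through the standard JobConf API; it is an illustration, not the exact
mechanism those JIRAs define, and the example values are arbitrary.

{code:java}
import org.apache.hadoop.mapred.JobConf;

public class MemoryLimitExample {
  public static void main(String[] args) {
    JobConf conf = new JobConf(MemoryLimitExample.class);

    // Cluster-wide cap on the total VM available to all tasks spawned by a TT
    // (normally an admin setting; shown here only for illustration): 8 GB.
    conf.setLong("mapred.tasktracker.tasks.maxmemory", 8L * 1024 * 1024 * 1024);

    // Per-job override of the default per-task VM limit: 2 GB.
    conf.setLong("mapred.task.default.maxmemory", 2L * 1024 * 1024 * 1024);
  }
}
{code}
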
Keeping all this in mind, here's a proposal for what we additionally need to do:

If _tasks.maxmemory_ is set, the TT monitors the total memory usage of all tasks it has spawned.
If this total goes over _tasks.maxmemory_, the TT needs to kill one or more tasks. It first looks
for tasks whose individual memory usage is over their _default.maxmemory_ value and kills them
(while ideally you would kill just enough of them to bring total usage back under the limit, it's
not obvious which of these violators to choose, so it's probably simpler to kill them all). If no
such task is found, or if killing one or more of these tasks still leaves us over the memory limit,
we need to pick other tasks to kill. There are many ways to do this; probably the easiest is to
kill the tasks that ran most recently. A sketch of this selection policy follows.
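
As a rough illustration (not actual TaskTracker code: the RunningTask type and its fields are
hypothetical, and "ran most recently" is interpreted here as "started most recently"), the
selection policy could look like this:

{code:java}
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a task the TT is currently running.
class RunningTask {
  String id;
  long memUsed;     // current VM usage of the task (and any processes it spawned)
  long maxMemory;   // per-task limit (cluster default or job override)
  long startTime;   // when the task was launched
}

class KillSelector {
  /** Pick tasks to kill once total usage exceeds tasksMaxMemory. */
  static List<RunningTask> selectTasksToKill(List<RunningTask> tasks, long tasksMaxMemory) {
    long total = 0;
    for (RunningTask t : tasks) {
      total += t.memUsed;
    }
    List<RunningTask> toKill = new ArrayList<RunningTask>();
    if (total <= tasksMaxMemory) {
      return toKill;  // nothing to do
    }

    // Step 1: kill every task that is over its own per-task limit.
    for (RunningTask t : tasks) {
      if (t.memUsed > t.maxMemory) {
        toKill.add(t);
        total -= t.memUsed;
      }
    }

    // Step 2: if still over the limit, kill the most recently started tasks.
    if (total > tasksMaxMemory) {
      List<RunningTask> rest = new ArrayList<RunningTask>(tasks);
      rest.removeAll(toKill);
      Collections.sort(rest, new Comparator<RunningTask>() {
        public int compare(RunningTask a, RunningTask b) {
          if (a.startTime == b.startTime) return 0;
          return a.startTime > b.startTime ? -1 : 1;  // newest first
        }
      });
      for (RunningTask t : rest) {
        if (total <= tasksMaxMemory) break;
        toKill.add(t);
        total -= t.memUsed;
      }
    }
    return toKill;
  }
}
{code}
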

Tasks that are killed because they went over their memory limit should be treated as failed,
since they violated their contract. Tasks that are killed because the sum total of memory
usage was over a limit should be treated as killed, since it's not really their fault. 
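
In other words (a minimal sketch with hypothetical enums, not the actual TaskStatus API), the
outcome would depend only on why the task was killed:

{code:java}
enum KillReason { OVER_TASK_LIMIT, TT_MEMORY_PRESSURE }
enum TaskOutcome { FAILED, KILLED }

class OutcomePolicy {
  static TaskOutcome outcomeFor(KillReason reason) {
    // Over-limit tasks violated their contract, so they count as failed;
    // tasks killed only to relieve overall TT memory pressure count as killed.
    return reason == KillReason.OVER_TASK_LIMIT ? TaskOutcome.FAILED : TaskOutcome.KILLED;
  }
}
{code}
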

Another improvement is to let _mapred.tasktracker.tasks.maxmemory_ be set by an external script,
which lets Ops control what this value should be. A slightly less desirable option, as indicated
in some offline discussions with Alan W, is to set this value either as an absolute number ("hadoop
may use X amount") or as an offset from the total amount of memory on the machine ("hadoop may
use all but 4g").
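
A sketch of how the TT might resolve this value under those options is below. The script path and
"reserved" offset are made-up knobs for illustration; only the notion of _tasks.maxmemory_ itself
comes from the above.

{code:java}
import java.io.BufferedReader;
import java.io.InputStreamReader;

class MaxMemoryResolver {
  /**
   * Resolve tasks.maxmemory, preferring an Ops-provided script, then an
   * absolute value ("hadoop may use X"), then total minus a reserved offset
   * ("hadoop may use all but 4g").
   */
  static long resolve(String scriptPath, long absoluteBytes,
                      long reservedBytes, long totalPhysicalBytes) throws Exception {
    if (scriptPath != null) {
      // The script is expected to print a single number of bytes on stdout.
      Process p = new ProcessBuilder(scriptPath).start();
      BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()));
      try {
        String line = r.readLine();
        p.waitFor();
        if (line != null) {
          return Long.parseLong(line.trim());
        }
      } finally {
        r.close();
      }
    }
    if (absoluteBytes > 0) {
      return absoluteBytes;
    }
    return totalPhysicalBytes - reservedBytes;
  }
}
{code}
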

> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>            Reporter: Vivek Ratan
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage
> of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage
> goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving
> jobs from bringing down nodes. What is also needed is the ability to make sure that the sum
> total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

