hadoop-common-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Vivek Ratan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4035) Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory requirements and task trackers free memory
Date Tue, 18 Nov 2008 09:21:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648541#action_12648541

Vivek Ratan commented on HADOOP-4035:

As Vinod has brought up, there are some edge cases and details missing in [the summary|https://issues.apache.org/jira/browse/HADOOP-4035?focusedCommentId=12644267#action_12644267]
that we need to cover. 

We want monitoring to work independent of scheduler support, i.e., even if the scheduler you're
using does not support memory-based scheduling, you may still want to make sure the TTs monitor
memory usage on their machines and kill tasks if too much memory is used. Based on what we've
described in the summary, the following three configuration settings are required for the
TT to do monitoring: {{mapred.tasktracker.virtualmemory.reserved}} (the offset of total VM
on the machine), {{mapred.task.default.maxvm}} (the default for maximum VM per task), and
{{mapred.task.limit.maxvm}} (the upper limit on the max VM per task). It is proposed that:

* if one or more of these three values are missing in the configuration, the TT disables monitoring
and logs an appropriate message. 
* At startup, the TT should also make sure that {{mapred.task.default.maxvm}} is not greater
than {{mapred.task.limit.maxvm}}. If it is, the TT logs a message and disables monitoring.
* if all three are present, the TT has enough information to compute the _max-VM-per-task_
limit for each task it runs and can successfully monitor memory usage.
* Without scheduler support, the TT can get a task whose _max-VM-per-task_ limit is higher
than {{mapred.task.limit.maxvm}} (i.e., the user-set value for a job's {{mapred.task.maxvm}}
can be higher than {{mapred.task.limit.maxvm}}). In such a case, the TT can choose to fail
the task, or it may still run the task while logging the problem. IMO, the former seems too
harsh and not something that the TT should possibly decide just based on its settings for
monitoring. In the latter case, the TT can still continue monitoring, but may end up killing
the wrong task if the sum of VMs used is over the _max-VM-per-node_ limit. I propose we do
the latter. 

The TT also needs to report memory information to the schedulers. As per HADOOP-3759, TTs
currently report, in each heartbeat, how much free VM they have (which is equal to _max-VM-per-node_
minus the sum of _max-VM-per-task_ for each running task). This makes sense if monitoring
is on, and the three necessary VM config values are defined. If they're not, and the TT cannot
determine its free VM, what should it report? 
* It can report -1, or some such value, indicating that it cannot compute free VM. 
* If we let schedulers decide how they want to behave in the absence of monitoring, or rather
in the absence of the  necessary VM config values being defined, a TT should always report
how much total VM (as well as RAM) it has, as well as its value for {{mapred.tasktracker.virtualmemory.reserved}}.

I propose the latter. TTs always report how much VM&RAM they have on their system, and
what offset settings they have. They're the only ones who have this information, and this
approach gives a lot of flexibility to the schedulers in terms of how to use that information.

What about schedulers? The Capacity Scheduler should do the following: 
* If any of the three mandatory VM settings are not set, it should not schedule based on VM
or RAM. The value of {{mapred.tasktracker.virtualmemory.reserved}} comes from the TT while
the other two can be read by the scheduler from its own config file. 
* If the mandatory VM values are set, as well as the mandatory RAM values ({{mapred.capacity-scheduler.default.ramlimit}},
{{mapred.task.limit.maxram}}), the scheduler uses both VM and RAM settings to schedule, as
defined in the earlier [summary|https://issues.apache.org/jira/browse/HADOOP-4035?focusedCommentId=12644267#action_12644267].

* If the mandatory VM values are set, but one or more of the mandatory RAM values are not,
the scheduler only uses VM values for scheduling. 

It's possible that other schedulers may choose a different algorithm. What's important is
that they have all the available information, which they should as per this proposal. 

> Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory requirements
and task trackers free memory
> ------------------------------------------------------------------------------------------------------------------------
>                 Key: HADOOP-4035
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4035
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Hemanth Yamijala
>            Assignee: Vinod K V
>            Priority: Blocker
>             Fix For: 0.20.0
>         Attachments: 4035.1.patch, HADOOP-4035-20080918.1.txt, HADOOP-4035-20081006.1.txt,
HADOOP-4035-20081006.txt, HADOOP-4035-20081008.txt
> HADOOP-3759 introduced configuration variables that can be used to specify memory requirements
for jobs, and also modified the tasktrackers to report their free memory. The capacity scheduler
in HADOOP-3445 should schedule tasks based on these parameters. A task that is scheduled on
a TT that uses more than the default amount of memory per slot can be viewed as effectively
using more than one slot, as it would decrease the amount of free memory on the TT by more
than the default amount while it runs. The scheduler should make the used capacity account
for this additional usage while enforcing limits, etc.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message