hadoop-common-dev mailing list archives

From "Vivek Ratan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4035) Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory requirements and task trackers free memory
Date Mon, 27 Oct 2008 06:40:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12642861#action_12642861 ]

Vivek Ratan commented on HADOOP-4035:

Here's a summary of what this patch should do.

TTs report the amount of free memory available on their node (as described in HADOOP-3759
and HADOOP-4439). Free memory equals the total VM assigned to Hadoop tasks on that node
(_mapred.tasktracker.tasks.maxmemory_) minus the VM guaranteed to the tasks that are already
running. The CapacityScheduler compares this free memory with a task's own memory requirement
to decide whether it can run the task. If the task requires more memory than is free on the TT,
the Scheduler returns nothing to the TT, so the TT finishes what it is running and eventually
has enough free memory.
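
The check described above can be sketched roughly as follows. This is only an illustration,
not the patch itself: the class and method names are hypothetical, and the sketch assumes all
memory values are expressed in the same (unspecified) unit.

```java
import java.util.List;

/** Hypothetical sketch; not the actual CapacityScheduler or TaskTracker API. */
final class MemoryAwareAssignment {

    /**
     * Free VM on a TT: total VM assigned to Hadoop tasks on the node
     * minus the VM guaranteed to tasks that are already running.
     */
    static long freeMemory(long totalTaskMemory, List<Long> runningTaskMemory) {
        long guaranteed = 0;
        for (long m : runningTaskMemory) {
            guaranteed += m;
        }
        return totalTaskMemory - guaranteed;
    }

    /**
     * True if the task fits in the TT's free memory. When this is false,
     * the scheduler would return nothing to the TT for this heartbeat.
     */
    static boolean canRun(long taskMemory, long freeMemory) {
        return taskMemory <= freeMemory;
    }
}
```

A scheduler built along these lines would call canRun() with the free memory the TT reported
and hand out the task only when it returns true.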

To make sure that no job asks for more memory than a TT has available, we should have a
cluster-wide limit on the amount of VM a job can request for its tasks. If this limit is set
and a job asks for more than the limit, the JT should not accept the job in submitJob().
If the limit is not set, jobs cannot be rejected based on their memory requirements.
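
A minimal sketch of that submission-time check, under stated assumptions: the class name, the
method name, and the use of -1 as a "limit not set" marker are all hypothetical and are not
taken from the patch or from the JobTracker code.

```java
/** Hypothetical sketch of the submit-time memory check; not the JobTracker API. */
final class JobMemoryLimit {

    /** Assumed marker meaning the cluster-wide limit is not configured. */
    static final long LIMIT_NOT_SET = -1L;

    /**
     * Decides whether a job may be accepted, given the VM it requests for its
     * tasks and the cluster-wide limit.
     */
    static boolean acceptJob(long requestedTaskMemory, long clusterLimit) {
        if (clusterLimit == LIMIT_NOT_SET) {
            // No limit configured: jobs cannot be rejected on memory grounds.
            return true;
        }
        return requestedTaskMemory <= clusterLimit;
    }
}
```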

> Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory requirements and task trackers free memory
> ------------------------------------------------------------------------------------------------------------------------
>                 Key: HADOOP-4035
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4035
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Hemanth Yamijala
>            Assignee: Vinod K V
>            Priority: Blocker
>             Fix For: 0.20.0
>         Attachments: 4035.1.patch, HADOOP-4035-20080918.1.txt, HADOOP-4035-20081006.1.txt, HADOOP-4035-20081006.txt, HADOOP-4035-20081008.txt
>
> HADOOP-3759 introduced configuration variables that can be used to specify memory requirements
> for jobs, and also modified the tasktrackers to report their free memory. The capacity scheduler
> in HADOOP-3445 should schedule tasks based on these parameters. A task that uses more than the
> default amount of memory per slot can be viewed as effectively using more than one slot on the
> TT it is scheduled on, as it decreases the amount of free memory on the TT by more than the
> default amount while it runs. The scheduler should make the used capacity account for this
> additional usage while enforcing limits, etc.
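
One way to picture the slot accounting in the description is to round a task's memory up to a
whole number of default-sized slots. The issue does not say how the extra usage should be
counted, so the rounding rule here is an assumption, and the class and method names are
hypothetical.

```java
/** Hypothetical sketch of memory-based slot accounting; the rounding rule is assumed. */
final class SlotAccounting {

    /**
     * Number of slots a task effectively occupies: its memory divided by the
     * default memory per slot, rounded up, and never less than one.
     */
    static int slotsOccupied(long taskMemory, long defaultMemoryPerSlot) {
        if (defaultMemoryPerSlot <= 0) {
            throw new IllegalArgumentException("defaultMemoryPerSlot must be positive");
        }
        if (taskMemory <= defaultMemoryPerSlot) {
            return 1;
        }
        // Ceiling division without floating point.
        return (int) ((taskMemory + defaultMemoryPerSlot - 1) / defaultMemoryPerSlot);
    }
}
```

Under this assumption, a task needing 1.5 times the default memory would be charged as two
slots when the scheduler computes used capacity.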

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
