hadoop-common-dev mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4035) Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory requirements and task trackers free memory
Date Fri, 31 Oct 2008 12:45:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644275#action_12644275 ]

Devaraj Das commented on HADOOP-4035:
-------------------------------------

On the scheduling, wouldn't it be nice if all schedulers could use this feature? One option
is to implement the policy in every scheduler. But given that this issue is targeted
for 19.1, the other option is to do the check within the JobInProgress, just as we do the
check for blacklisted TTs (where we don't give a task to a blacklisted TT). Specifically,
couldn't the JobInProgress.shouldRunOnTaskTracker method do this check and decline to assign
the TT a task when the memory parameters don't allow it? All the information related to the
memory parameters is available at this point via TaskTrackerStatus and the jobconf of the job.
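
A rough sketch of what such a check could look like is below. It is only an illustration
under stated assumptions: TrackerInfo, UNSPECIFIED and the method signature are hypothetical
stand-ins, not the real TaskTrackerStatus, jobconf or JobInProgress APIs, and the rule
"free memory must cover the task's requirement" is the simplest greedy policy one could pick.

```java
// Hypothetical sketch; none of these types are the real Hadoop classes.
final class MemoryCheckSketch {

    /** Assumed marker for "the job did not specify a memory requirement". */
    static final long UNSPECIFIED = -1L;

    /** Stand-in for the fields a task tracker status report might carry. */
    static final class TrackerInfo {
        final String name;
        final long freeMemoryBytes;
        final boolean blacklisted;

        TrackerInfo(String name, long freeMemoryBytes, boolean blacklisted) {
            this.name = name;
            this.freeMemoryBytes = freeMemoryBytes;
            this.blacklisted = blacklisted;
        }
    }

    /**
     * Greedy base-level check: refuse blacklisted trackers first, then refuse
     * trackers whose reported free memory cannot cover the task's requirement.
     */
    static boolean shouldRunOnTaskTracker(TrackerInfo tt, long taskMemoryBytes) {
        if (tt.blacklisted) {
            return false;
        }
        if (taskMemoryBytes == UNSPECIFIED) {
            // Assumption: with no stated requirement, the memory check is skipped.
            return true;
        }
        return tt.freeMemoryBytes >= taskMemoryBytes;
    }

    public static void main(String[] args) {
        TrackerInfo tt = new TrackerInfo("tt1", 2048L, false);
        System.out.println(shouldRunOnTaskTracker(tt, 1024L));
        System.out.println(shouldRunOnTaskTracker(tt, 4096L));
    }
}
```

Because the check sits next to the blacklist test, any scheduler that reaches this method
would get the same behaviour without carrying its own copy of the policy.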

One more line of argument could be that we are really just doing greedy scheduling with respect
to the memory-related parameters. So this base-level greedy scheduling should be in a place that
is in the code path of all schedulers, i.e., in JobInProgress.shouldRunOnTaskTracker.
If some scheduler wants to do something better than that, it can always do so, since control
is given to the scheduler code first (assignTasks).
Thoughts?

> Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory requirements and task trackers free memory
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4035
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4035
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Hemanth Yamijala
>            Assignee: Vinod K V
>            Priority: Blocker
>             Fix For: 0.20.0
>
>         Attachments: 4035.1.patch, HADOOP-4035-20080918.1.txt, HADOOP-4035-20081006.1.txt, HADOOP-4035-20081006.txt, HADOOP-4035-20081008.txt
>
>
> HADOOP-3759 introduced configuration variables that can be used to specify memory requirements
> for jobs, and also modified the tasktrackers to report their free memory. The capacity scheduler
> in HADOOP-3445 should schedule tasks based on these parameters. A task that is scheduled on
> a TT that uses more than the default amount of memory per slot can be viewed as effectively
> using more than one slot, as it would decrease the amount of free memory on the TT by more
> than the default amount while it runs. The scheduler should make the used capacity account
> for this additional usage while enforcing limits, etc.
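
The slot accounting described in the issue text above can be illustrated with a small
calculation. This is a sketch only: slotsOccupied and its parameters are hypothetical names,
and rounding up to a whole number of slots is an assumption, since the description says only
that such a task effectively uses "more than one slot".

```java
// Hypothetical sketch; not part of the capacity scheduler's actual code.
final class SlotAccountingSketch {

    /**
     * Number of slots a task is charged for, given its memory requirement and
     * the default memory per slot. Assumption: partial slots are rounded up.
     */
    static int slotsOccupied(long taskMemoryBytes, long defaultMemoryPerSlotBytes) {
        if (defaultMemoryPerSlotBytes <= 0) {
            throw new IllegalArgumentException("default memory per slot must be positive");
        }
        if (taskMemoryBytes <= defaultMemoryPerSlotBytes) {
            // A task at or below the default still occupies one slot.
            return 1;
        }
        // Ceiling division without floating point.
        return (int) ((taskMemoryBytes + defaultMemoryPerSlotBytes - 1) / defaultMemoryPerSlotBytes);
    }

    public static void main(String[] args) {
        long perSlot = 1024L;
        System.out.println(slotsOccupied(512L, perSlot));
        System.out.println(slotsOccupied(2560L, perSlot));
    }
}
```

Under this assumption, a scheduler enforcing capacity limits would add the returned value,
rather than 1, to the used capacity for each running task.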

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

