hadoop-common-dev mailing list archives

From "Hemanth Yamijala (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4035) Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory requirements and task trackers free memory
Date Fri, 10 Oct 2008 15:01:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638574#action_12638574 ]

Hemanth Yamijala commented on HADOOP-4035:
------------------------------------------

Some comments:

JobConf:
- I think it is OK to expose, as an API, whether memory-based scheduling is enabled.

CapacityTaskScheduler:
- {{jobFitsOnTT}}: if a job has not requested any memory, we promise it at least defaultMemoryPerSlot
on the TT. So I think this method should still check for that case.
- Since we already have a map/reduce based {{TaskSchedulingMgr}}, can we implement {{jobFitsOnTT}}
without checks on whether the task is a map or a reduce? One way to do that would be
to define an abstract {{getFreeVirtualMemoryForTask()}} in {{TaskSchedulingMgr}} and implement
it in {{MapSchedulingMgr}} to return {{resourceStatus.getFreeVirtualMemoryForMaps()}},
and so on.
- {{InAdequateResourcesException}} should be renamed to {{InadequateResourcesException}}. Does it
need to extend IOException?
- {{updateResourcesInformation}}: if any one TT reports DISABLED_VIRTUAL_MEMORY_LIMIT,
we don't need to proceed further in the loop. A small optimization?
- Also, this need not be done at all if memory management is disabled.
- {{jip.isKillInProgress()}}: I think this is going to be changed. Will this trigger {{jobCompleted}}
events? This should be checked against the solution for HADOOP-4053.
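
To make the {{getFreeVirtualMemoryForTask()}} suggestion concrete, here is a minimal sketch of what I have in mind. It is not taken from the patch: {{ResourceStatus}} below is a simplified stand-in for the real class, {{getFreeVirtualMemoryForReduces()}} is my assumed counterpart of the map-side getter, and the NO_MEMORY_REQUESTED sentinel and the {{jobFitsOnTT}} parameters are hypothetical.

```java
// Simplified stand-in for the tasktracker resource report (not the real Hadoop class).
final class ResourceStatus {
    private final long freeForMaps;
    private final long freeForReduces;

    ResourceStatus(long freeForMaps, long freeForReduces) {
        this.freeForMaps = freeForMaps;
        this.freeForReduces = freeForReduces;
    }

    long getFreeVirtualMemoryForMaps() { return freeForMaps; }

    // Assumed counterpart of the map-side getter.
    long getFreeVirtualMemoryForReduces() { return freeForReduces; }
}

abstract class TaskSchedulingMgr {
    // Hypothetical sentinel meaning "the job did not ask for any memory".
    static final long NO_MEMORY_REQUESTED = -1L;

    // Each subclass says where the free memory for its task type comes from.
    abstract long getFreeVirtualMemoryForTask(ResourceStatus status);

    // No map/reduce branching here: the subclass supplies the right figure.
    boolean jobFitsOnTT(long requestedMemory, long defaultMemoryPerSlot,
                        ResourceStatus status) {
        // A job that requested nothing is still promised defaultMemoryPerSlot.
        long needed = (requestedMemory == NO_MEMORY_REQUESTED)
                ? defaultMemoryPerSlot
                : requestedMemory;
        return needed <= getFreeVirtualMemoryForTask(status);
    }
}

final class MapSchedulingMgr extends TaskSchedulingMgr {
    @Override
    long getFreeVirtualMemoryForTask(ResourceStatus status) {
        return status.getFreeVirtualMemoryForMaps();
    }
}

final class ReduceSchedulingMgr extends TaskSchedulingMgr {
    @Override
    long getFreeVirtualMemoryForTask(ResourceStatus status) {
        return status.getFreeVirtualMemoryForReduces();
    }
}
```

With this shape, the check for jobs that requested no memory lives in one place and the map/reduce distinction is reduced to a single overridden getter.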

- Can we somehow avoid duplicating the following code between {{CapacityTaskScheduler}} and
{{JobQueueTaskScheduler}}:
-- {{jobFitsOnTT}}
-- {{updateResourcesInformation()}}
-- killing of jobs
This is a significant amount of logic, and keeping it in one place would help.
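
One possible way to share this logic, sketched below, is a small helper that both schedulers delegate to. The sketch also shows the early exit suggested for {{updateResourcesInformation}}. Everything here is an assumption for illustration: the helper name, the value of the sentinel, and the idea that the loop only needs to learn whether every TT reports a limit.

```java
import java.util.List;

// Hypothetical shared helper; both schedulers could call into it
// instead of each carrying its own copy of the loop.
final class MemoryCheckHelper {
    // Assumed sentinel value; the real constant may differ.
    static final long DISABLED_VIRTUAL_MEMORY_LIMIT = -1L;

    private MemoryCheckHelper() {}

    // Returns true only if memory management is on and every TT reports a limit.
    static boolean allTrackersReportLimits(boolean memoryManagementEnabled,
                                           List<Long> trackerLimits) {
        // Nothing to compute when memory management is disabled.
        if (!memoryManagementEnabled) {
            return false;
        }
        for (long limit : trackerLimits) {
            // One disabled TT settles the answer, so stop looping.
            if (limit == DISABLED_VIRTUAL_MEMORY_LIMIT) {
                return false;
            }
        }
        return true;
    }
}
```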

I still need to review the changes to the test cases.


> Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory requirements
and task trackers free memory
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4035
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4035
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Hemanth Yamijala
>            Assignee: Vinod K V
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: 4035.1.patch, HADOOP-4035-20080918.1.txt, HADOOP-4035-20081006.1.txt,
HADOOP-4035-20081006.txt, HADOOP-4035-20081008.txt
>
>
> HADOOP-3759 introduced configuration variables that can be used to specify memory requirements
for jobs, and also modified the tasktrackers to report their free memory. The capacity scheduler
in HADOOP-3445 should schedule tasks based on these parameters. A task that is scheduled on
a TT that uses more than the default amount of memory per slot can be viewed as effectively
using more than one slot, as it would decrease the amount of free memory on the TT by more
than the default amount while it runs. The scheduler should make the used capacity account
for this additional usage while enforcing limits, etc.
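
As a rough illustration of the "more than one slot" accounting described above, the sketch below charges a task one slot per started multiple of the default memory per slot. The class and method names are hypothetical, and rounding up is an assumption; the issue description does not say how the extra usage should be counted.

```java
// Hypothetical helper; not part of the capacity scheduler.
final class SlotAccounting {
    private SlotAccounting() {}

    // Number of slots a task is charged for, assuming usage is rounded up
    // to whole slots and every task occupies at least one slot.
    static long effectiveSlots(long taskMemory, long defaultMemoryPerSlot) {
        if (defaultMemoryPerSlot <= 0) {
            throw new IllegalArgumentException("defaultMemoryPerSlot must be positive");
        }
        if (taskMemory <= defaultMemoryPerSlot) {
            return 1L;
        }
        // Integer ceiling of taskMemory / defaultMemoryPerSlot.
        return (taskMemory + defaultMemoryPerSlot - 1) / defaultMemoryPerSlot;
    }
}
```

Under these assumptions, a task asking for 1536 units on a TT with 512 units per slot would count as 3 slots against the used capacity.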

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

