hadoop-common-dev mailing list archives

From "Hemanth Yamijala (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4035) Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory requirements and task trackers free memory
Date Tue, 30 Sep 2008 05:09:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635660#action_12635660 ]

Hemanth Yamijala commented on HADOOP-4035:

bq. It is possible to have two different types of machine in the same cluster.... the only
difference being the amount of memory on these types. Since the CPU capacity is the same,
I would ideally configure both types of machines to have the same number of slots.

Dhruba, agreed. This was also indicated in Owen's comments above. So, this assumption is no
longer valid. We had this assumption to help us map tasks to slots easily. This in turn was
to meet the 2nd requirement I'd put up above: 

bq. A user whose job requests more resources than usual would decrease the free memory
on the tasktracker more than other jobs would. Therefore the user must be 'charged' the additional
usage so that he would hit his limits and capacities sooner.

But as I described above, we would like to keep things simple, and not do this mapping for
now. It is a little less fair, but we can try out how it works.

bq. It would be nice if the JT/TT can compute the memory capacity per slot and then schedule
tasks accordingly.

HADOOP-3759 laid down the framework for the TT to do this. This JIRA will address the scheduling.
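
As a rough illustration of the per-slot computation suggested in the quote above, here is a minimal sketch. The function names (memory_per_slot, fits) and the megabyte units are hypothetical and are not taken from HADOOP-3759 or from the attached patches.

```python
def memory_per_slot(total_memory_mb: int, num_slots: int) -> float:
    """Memory capacity available to each slot on one tasktracker (assumed MB)."""
    if num_slots <= 0:
        raise ValueError("num_slots must be positive")
    return total_memory_mb / num_slots


def fits(task_memory_mb: int, free_memory_mb: int) -> bool:
    """True if a task's memory requirement fits in the TT's reported free memory."""
    return task_memory_mb <= free_memory_mb
```

With two machine types that have the same number of slots but different memory, memory_per_slot simply returns a larger value for the bigger machine, so a memory-hungry task passes the fits check there more often.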

bq. Also, the JT scheduler can generate more affinity of reduce tasks to slots with larger
memory-capacity-per-slot because reduce tasks possibly take more memory than map tasks.

We do not currently distinguish between the memory requirements of map and reduce slots.
Does this seem vital to have?

> Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory requirements
and task trackers free memory
> ------------------------------------------------------------------------------------------------------------------------
>                 Key: HADOOP-4035
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4035
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Hemanth Yamijala
>            Assignee: Vinod K V
>            Priority: Blocker
>             Fix For: 0.19.0
>         Attachments: 4035.1.patch, HADOOP-4035-20080918.1.txt
> HADOOP-3759 introduced configuration variables that can be used to specify memory requirements
for jobs, and also modified the tasktrackers to report their free memory. The capacity scheduler
in HADOOP-3445 should schedule tasks based on these parameters. A task that is scheduled on
a TT that uses more than the default amount of memory per slot can be viewed as effectively
using more than one slot, as it would decrease the amount of free memory on the TT by more
than the default amount while it runs. The scheduler should make the used capacity account
for this additional usage while enforcing limits, etc.
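
The "more than one slot" accounting described above could be sketched as follows. This is only an illustration under stated assumptions: the name slots_charged, the megabyte units, and the choice to round up to whole slots are hypothetical and do not come from the issue or its patches.

```python
def slots_charged(task_memory_mb: int, default_memory_per_slot_mb: int) -> int:
    """Number of slots to charge a task against its user's limits and capacity.

    Assumption: a task is charged one slot per started multiple of the
    default per-slot memory, and never less than one slot.
    """
    if task_memory_mb < 0 or default_memory_per_slot_mb <= 0:
        raise ValueError("memory values must be positive")
    # Integer ceiling division: round up to whole slots.
    charged = -(-task_memory_mb // default_memory_per_slot_mb)
    return max(1, charged)
```

Under this rounding, a task that asks for 1024 MB on a TT whose default is 512 MB per slot would be charged two slots, so its user would reach configured limits sooner than a user running default-sized tasks.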

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
