hadoop-common-dev mailing list archives

From "Hemanth Yamijala (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4035) Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory requirements and task trackers free memory
Date Fri, 26 Sep 2008 07:21:46 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634777#action_12634777 ]

Hemanth Yamijala commented on HADOOP-4035:

After an offline discussion, Owen proposed the following:

- The scheduler assigns a task to a TT only if the amount of free memory reported is greater
than the task's requirements.
- If it doesn't match, we don't move to the next job. That is, we block, thus removing any
possible starvation of this job.
- We don't bother about making this job account for more usage at this point, and handle that
problem later, mostly after 0.19.

The only disadvantage I see with this approach is that a user who submits a job with
high memory requirements could essentially block other users, at least until his
limit is hit.

So, I would suggest we change the above proposal to not block, but instead move over to the
next job. This way, a user with high RAM requirements cannot block other users, and cannot
game the system in that way.
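
To make the difference between the two behaviors concrete, here is a minimal sketch in
Java. It is not the capacity scheduler's code: the class MemoryAwarePicker, the method
pick, and the flag blockOnMismatch are hypothetical names, and a job is reduced to the
memory one of its tasks needs. The fit test uses "strictly greater", as worded in the
proposal above.

```java
// Hypothetical sketch; none of these names come from the Hadoop code base.
final class MemoryAwarePicker {

    /**
     * Chooses which queued job gets a task on a tasktracker.
     *
     * @param taskMemoryMb    memory needed by a task of each job, in queue order
     * @param freeMb          free memory reported by the tasktracker
     * @param blockOnMismatch true  = stop at the first job that does not fit
     *                        false = skip that job and try the next one
     * @return index of the chosen job, or -1 if no task is assigned
     */
    static int pick(long[] taskMemoryMb, long freeMb, boolean blockOnMismatch) {
        for (int i = 0; i < taskMemoryMb.length; i++) {
            // Assign only if free memory is greater than the task's requirement.
            if (freeMb > taskMemoryMb[i]) {
                return i;
            }
            if (blockOnMismatch) {
                // Blocking: later jobs are not considered, so the first job
                // cannot be starved, but it holds up everyone behind it.
                return -1;
            }
            // Non-blocking: fall through and consider the next job.
        }
        return -1;
    }
}
```

With a queue of {2048, 512} and a tracker reporting 1024 MB free, blocking assigns
nothing, while the non-blocking variant assigns a task from the second job.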

Note that:
- This is exactly what we do in HADOOP-657 for disk space usage.
- When we introduce accounting, we can revisit the decision on blocking.

Can we agree on this?

> Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory requirements and task trackers free memory
> ------------------------------------------------------------------------------------------------------------------------
>                 Key: HADOOP-4035
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4035
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Hemanth Yamijala
>            Assignee: Vinod K V
>            Priority: Blocker
>             Fix For: 0.19.0
>         Attachments: 4035.1.patch, HADOOP-4035-20080918.1.txt
>
> HADOOP-3759 introduced configuration variables that can be used to specify memory requirements
> for jobs, and also modified the tasktrackers to report their free memory. The capacity scheduler
> in HADOOP-3445 should schedule tasks based on these parameters. A task that is scheduled on
> a TT that uses more than the default amount of memory per slot can be viewed as effectively
> using more than one slot, as it would decrease the amount of free memory on the TT by more
> than the default amount while it runs. The scheduler should make the used capacity account
> for this additional usage while enforcing limits, etc.
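
One way to read the "more than one slot" accounting in the description is sketched
below. This is an assumption-laden illustration, not the patch: the class
SlotAccounting and the method slotsUsed are hypothetical, and rounding up to a whole
number of slots is my assumption; the description only says such a task effectively
uses more than one slot.

```java
// Hypothetical sketch; these names are not from the Hadoop code base.
final class SlotAccounting {

    /**
     * Number of slots a task is charged for, given its memory requirement
     * and the default amount of memory per slot (both in MB).
     * Assumption: usage is rounded up to whole slots, with a minimum of one.
     */
    static long slotsUsed(long taskMemoryMb, long defaultMbPerSlot) {
        if (taskMemoryMb < 0 || defaultMbPerSlot <= 0) {
            throw new IllegalArgumentException("memory values must be positive");
        }
        // Integer ceiling of taskMemoryMb / defaultMbPerSlot.
        long slots = (taskMemoryMb + defaultMbPerSlot - 1) / defaultMbPerSlot;
        // A task always occupies at least the slot it runs in.
        return Math.max(1L, slots);
    }
}
```

Under these assumptions, with 512 MB per slot, a 1024 MB task is charged for two slots
and a 600 MB task is also charged for two, while a 100 MB task is charged for one.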

