hadoop-common-dev mailing list archives

From "Vinod K V (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-4035) Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory requirements and task trackers free memory
Date Fri, 28 Nov 2008 14:23:44 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-4035:

    Attachment: HADOOP-4035-20081128-4.txt

Attaching a new patch.
 - killJobsWithInvalidRequirements is now done in scheduler.jobAdded() itself, so it is O(1)
now. Invalid jobs get rejected right away: jobAdded throws an IOException, and the exception
is propagated to the client with a message (see the sketch after this list).
 - Documented the configuration properties in hadoop-default.xml and capacity-scheduler-conf.xml.
 - Left the config names as "reserved" instead of "excluded", since they truly specify what
is reserved by the TT for its own and the system's usage.
 - Incorporated the rest of the comments.
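
A self-contained sketch of the idea behind the jobAdded() check (all names and limits below
are made up for illustration; this is not the patch's code):

{code:java}
import java.io.IOException;

// Illustration: validate a job's per-task memory requirement when it is
// added to the scheduler, and reject it right away by throwing an
// IOException whose message reaches the submitting client.
public class JobAddedCheckSketch {

  static final long LIMIT_MAX_MEM_PER_TASK_MB = 2048; // assumed cluster-wide limit

  /** Minimal stand-in for a job and its configured memory requirement. */
  static class Job {
    final String id;
    final long memPerTaskMB;
    Job(String id, long memPerTaskMB) { this.id = id; this.memPerTaskMB = memPerTaskMB; }
  }

  /** Called once when a job is added; the check is O(1) per job. */
  static void jobAdded(Job job) throws IOException {
    if (job.memPerTaskMB > LIMIT_MAX_MEM_PER_TASK_MB) {
      throw new IOException("Job " + job.id + " asks for " + job.memPerTaskMB
          + " MB per task, more than the allowed " + LIMIT_MAX_MEM_PER_TASK_MB
          + " MB. Rejecting the job.");
    }
    // ... normal bookkeeping for accepted jobs ...
  }

  public static void main(String[] args) throws IOException {
    jobAdded(new Job("job_200811280001_0001", 1024)); // accepted
    jobAdded(new Job("job_200811280001_0002", 4096)); // throws IOException
  }
}
{code}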

ant test-patch gave all +1s except a -1 for findBugs. This is about unsynchronized access to
a TaskTrackerStatus object. The patch does the right thing, but there are other, earlier
unsynchronized accesses; the changes in this patch merely triggered an old findBugs warning.
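
For reference, the guarded-access pattern findBugs is checking for looks roughly like the
following. This is a generic illustration with stand-in types, not code from this patch:

{code:java}
import java.util.HashMap;
import java.util.Map;

// Once one access to a shared status object is synchronized, all other
// reads/writes of it should go through the same lock; otherwise findBugs
// flags inconsistent synchronization.
public class GuardedStatusAccess {
  static class TrackerStatus {          // stand-in for TaskTrackerStatus
    long freeMemoryMB;
  }

  private final Map<String, TrackerStatus> trackers = new HashMap<String, TrackerStatus>();

  long freeMemoryFor(String trackerName) {
    synchronized (trackers) {           // read under the common lock
      TrackerStatus status = trackers.get(trackerName);
      return (status == null) ? 0 : status.freeMemoryMB;
    }
  }

  void update(String trackerName, long freeMemoryMB) {
    synchronized (trackers) {           // write under the same lock
      TrackerStatus status = trackers.get(trackerName);
      if (status == null) {
        status = new TrackerStatus();
        trackers.put(trackerName, status);
      }
      status.freeMemoryMB = freeMemoryMB;
    }
  }
}
{code}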

I've run the core and contrib tests; both completed successfully.

Things to be done later in other JIRAs:
 - Caching the jobId -> value mapping for running jobs.
 - Information on the UI stating the reason why a job was killed when it specified invalid requirements.
 - TT reports total and reserved memory values. These values don't change in general, so if
possible, they can be reported only once.
 - Resource information should be visible on TTs UI pages.
 - Configuration should support specifying values in KB, MB, GB etc. (a parsing sketch follows this list).
 - In case of no scheduler support, the TaskTrackerMemoryManager thread should fail jobs that
specify memory requirements higher than the cluster-wide upper limits. It currently just logs
that this happened and silently ignores it (see the TODO in TaskTrackerMemoryManager).
 - Need Forrest documentation of how the capacity scheduler deals with memory-based scheduling.
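
For the KB/MB/GB item above, a possible parsing helper could look like the following sketch
(names and behaviour are assumptions, not part of the attached patch):

{code:java}
// Sketch of a helper that turns "512", "512MB" or "2GB" into bytes.
public class MemorySizeParser {
  public static long parse(String value) {
    String v = value.trim().toUpperCase();
    long multiplier = 1L;
    if (v.endsWith("KB")) {
      multiplier = 1024L;
    } else if (v.endsWith("MB")) {
      multiplier = 1024L * 1024L;
    } else if (v.endsWith("GB")) {
      multiplier = 1024L * 1024L * 1024L;
    }
    if (multiplier != 1L) {
      v = v.substring(0, v.length() - 2).trim();
    }
    return Long.parseLong(v) * multiplier;
  }

  public static void main(String[] args) {
    System.out.println(parse("512MB")); // 536870912
    System.out.println(parse("2GB"));   // 2147483648
  }
}
{code}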

> Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory requirements
> and task trackers free memory
> ------------------------------------------------------------------------------------------------------------------------
>                 Key: HADOOP-4035
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4035
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Hemanth Yamijala
>            Assignee: Vinod K V
>            Priority: Blocker
>             Fix For: 0.20.0
>         Attachments: 4035.1.patch, HADOOP-4035-20080918.1.txt, HADOOP-4035-20081006.1.txt,
> HADOOP-4035-20081006.txt, HADOOP-4035-20081008.txt, HADOOP-4035-20081121.txt, HADOOP-4035-20081126.1.txt,
> HADOOP-3759 introduced configuration variables that can be used to specify memory requirements
> for jobs, and also modified the tasktrackers to report their free memory. The capacity scheduler
> in HADOOP-3445 should schedule tasks based on these parameters. A task that is scheduled on
> a TT that uses more than the default amount of memory per slot can be viewed as effectively
> using more than one slot, as it would decrease the amount of free memory on the TT by more
> than the default amount while it runs. The scheduler should make the used capacity account
> for this additional usage while enforcing limits, etc.
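
As a worked illustration of the "more than one slot" accounting described above (numbers and
names assumed, not from the patch): with a default of 512 MB per slot, a task asking for
1536 MB should count as ceil(1536/512) = 3 slots towards the queue's used capacity.

{code:java}
// Illustrative sketch of slot accounting for memory-hungry tasks.
public class EffectiveSlots {
  static final long DEFAULT_MEM_PER_SLOT_MB = 512; // assumed default per-slot memory

  /** Slots a task effectively occupies, given its memory requirement in MB. */
  static int slotsNeeded(long taskMemMB) {
    return (int) ((taskMemMB + DEFAULT_MEM_PER_SLOT_MB - 1) / DEFAULT_MEM_PER_SLOT_MB);
  }

  public static void main(String[] args) {
    System.out.println(slotsNeeded(512));  // 1 slot
    System.out.println(slotsNeeded(1536)); // 3 slots
  }
}
{code}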

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
