hadoop-common-dev mailing list archives

From "Hemanth Yamijala (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4035) Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory requirements and task trackers free memory
Date Wed, 26 Nov 2008 11:54:46 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650970#action_12650970 ]

Hemanth Yamijala commented on HADOOP-4035:
------------------------------------------

I've started looking at this patch. Here are a few initial comments. I have yet to look at
the scheduling and the test cases:

Configuration:
- We can now introduce the config variables back into hadoop-defaults.xml
- I think the variables should be in bytes. As Doug mentioned in the comments above, we should
move to supporting formats that mention units like 'KB' in a separate JIRA. When we do that,
it makes more sense to say that if no unit is specified, the value is in the smallest unit,
which is bytes. Hence treating the values as bytes here will make backwards compatibility easy.
- As I've mentioned above, I still recommend changing the term 'reserved' to 'excluded'. Also,
I would recommend consistent names for the variables; for example, we can use pmem and vmem everywhere
to indicate physical and virtual memory.
- In the javadoc for the JobConf variables we should have a note asking readers to refer to
the documentation of the scheduler being used to see how it does memory based scheduling.
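
To illustrate the unit-handling point above, here is a minimal sketch of how a value with an optional unit suffix could be read, with a bare number treated as bytes. The class MemorySizeParser, the method parseBytes, and the accepted suffixes (KB, MB, GB as powers of 1024) are assumptions made for this example; they are not part of the patch or of Hadoop.

```java
import java.util.Locale;

/** Hypothetical helper for illustration only; not part of Hadoop or the patch. */
final class MemorySizeParser {
    private MemorySizeParser() {}

    /**
     * Parses values such as "1024", "512KB", "2 MB" or "1GB" into bytes.
     * A value with no unit suffix is taken to be in bytes already, so
     * existing byte-valued settings keep their meaning.
     */
    static long parseBytes(String value) {
        String s = value.trim().toUpperCase(Locale.ROOT);
        long multiplier = 1L; // no suffix: the number is already in bytes
        if (s.endsWith("KB")) {
            multiplier = 1024L;
        } else if (s.endsWith("MB")) {
            multiplier = 1024L * 1024L;
        } else if (s.endsWith("GB")) {
            multiplier = 1024L * 1024L * 1024L;
        }
        if (multiplier != 1L) {
            // Drop the two-character suffix and any space before it.
            s = s.substring(0, s.length() - 2).trim();
        }
        long number = Long.parseLong(s); // NumberFormatException on malformed input
        if (number < 0) {
            throw new IllegalArgumentException("negative memory size: " + value);
        }
        return Math.multiplyExact(number, multiplier); // ArithmeticException on overflow
    }
}
```

With this convention, a setting written as "1024" before units are supported reads the same way afterwards, which is the backwards-compatibility argument made in the comment.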

Monitoring:
- Can we change the TODO in TaskMemoryManagerThread to remove the "I'm not comfortable..."
part? We should still explain the alternative that you've mentioned in the comment, though.
- We should also do a sanity check that the reserved limit is less than the total memory, and turn
off monitoring if it's not.
- MemoryCalculatorPlugin requires the ASF license header; likewise LinuxMemoryCalculatorPlugin.
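
The sanity check suggested above could look roughly like the following. This is only a sketch: the class MemoryMonitoringCheck and the method shouldMonitor are hypothetical names, and treating a non-positive total or a negative reserved value as "do not monitor" is an assumption of this example rather than behaviour taken from the patch.

```java
/** Hypothetical sketch for illustration only; not the TaskTracker code in the patch. */
final class MemoryMonitoringCheck {
    private MemoryMonitoringCheck() {}

    /**
     * Returns true only when the reserved limit is strictly less than the
     * total memory. The caller is expected to turn monitoring off when this
     * returns false.
     */
    static boolean shouldMonitor(long totalMemoryBytes, long reservedMemoryBytes) {
        if (totalMemoryBytes <= 0) {
            return false; // assumption: an unknown or invalid total disables monitoring
        }
        if (reservedMemoryBytes < 0) {
            return false; // assumption: a negative reserved value is treated as invalid
        }
        return reservedMemoryBytes < totalMemoryBytes;
    }
}
```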

> Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory requirements
> and task trackers free memory
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4035
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4035
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Hemanth Yamijala
>            Assignee: Vinod K V
>            Priority: Blocker
>             Fix For: 0.20.0
>
>         Attachments: 4035.1.patch, HADOOP-4035-20080918.1.txt, HADOOP-4035-20081006.1.txt,
>                      HADOOP-4035-20081006.txt, HADOOP-4035-20081008.txt, HADOOP-4035-20081121.txt,
>                      HADOOP-4035-20081126.1.txt
>
>
> HADOOP-3759 introduced configuration variables that can be used to specify memory requirements
> for jobs, and also modified the tasktrackers to report their free memory. The capacity scheduler
> in HADOOP-3445 should schedule tasks based on these parameters. A task that is scheduled on
> a TT that uses more than the default amount of memory per slot can be viewed as effectively
> using more than one slot, as it would decrease the amount of free memory on the TT by more
> than the default amount while it runs. The scheduler should make the used capacity account
> for this additional usage while enforcing limits, etc.
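
One way to read the accounting described in the issue text is to charge a task for as many slots as its memory requirement covers, rounded up. The sketch below illustrates that arithmetic only; the class SlotAccounting, the method slotsOccupied, and the choice to round up to whole slots are assumptions of this example, not the scheduler's actual implementation.

```java
/** Hypothetical sketch for illustration only; not the capacity scheduler code. */
final class SlotAccounting {
    private SlotAccounting() {}

    /**
     * Number of slots a task is charged for, given its memory requirement and
     * the default amount of memory per slot. A task at or below the default
     * counts as one slot; a larger task counts as ceil(task / perSlot) slots.
     */
    static int slotsOccupied(long taskMemoryBytes, long defaultMemoryPerSlotBytes) {
        if (defaultMemoryPerSlotBytes <= 0) {
            throw new IllegalArgumentException("memory per slot must be positive");
        }
        if (taskMemoryBytes <= defaultMemoryPerSlotBytes) {
            return 1; // every running task occupies at least one slot
        }
        // Ceiling division written so the intermediate sum cannot overflow.
        long slots = (taskMemoryBytes - 1) / defaultMemoryPerSlotBytes + 1;
        return Math.toIntExact(slots);
    }
}
```

For example, with 1 GB per slot, a task asking for 2.5 GB would be counted as three slots against the queue's used capacity under this rounding assumption.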

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

