hadoop-mapreduce-issues mailing list archives

From "Scott Chen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-961) ResourceAwareLoadManager to dynamically decide new tasks based on current CPU/memory load on TaskTracker(s)
Date Wed, 14 Oct 2009 20:46:31 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765743#action_12765743 ]

Scott Chen commented on MAPREDUCE-961:
--------------------------------------

@Vinod:

Thank you for the suggestions. Folding the resource monitoring daemon into the TaskTracker and
the Collector into the JobTracker is a really good idea.
Let me repeat your points to check that I understand them:
1. A lot of code/logic can be reused, such as the heartbeat mechanism.
2. The information is more cohesive (TaskTrackerStatus.ResourceStatus holds all the utilization
information).
3. The monitoring daemon can access the TaskTracker's information (taskid, jobid...).
4. The Collector can access the JobTracker's information (jobid, user, #map tasks, #reduce tasks...).

We built them as separate daemons mainly because we want this to run across multiple
map-reduce clusters, as Dhruba mentioned.
Also, at this stage, the daemons are easy to test because they do not depend on the JT or TT,
and we can change or restart them without affecting the map-reduce cluster.

I will definitely study how to put these daemons inside the TT and JT. One possibility is
to build them inside the TT and JT but still provide the RPC interface in the Collector.
If we need information from multiple clusters, we can go to the corresponding Collectors and
get it via RPC.
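
To make the multi-cluster idea concrete, here is a minimal sketch of how a client could merge the views of several Collectors. This is only an illustration under assumptions: the names Collector, MultiClusterView, getCpuUtilization, register and snapshot are hypothetical and are not part of the patch or of Hadoop, and the RPC transport is replaced by a plain Java interface so the example stays self-contained.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-cluster query interface; in practice it would be exposed over RPC. */
interface Collector {
    /** Returns CPU utilization in [0, 1], keyed by TaskTracker host name. */
    Map<String, Double> getCpuUtilization();
}

/** Hypothetical client that merges the utilization reported by several clusters. */
final class MultiClusterView {
    private final Map<String, Collector> collectors = new HashMap<>();

    /** Associates a cluster name with the Collector that serves it. */
    void register(String clusterName, Collector collector) {
        collectors.put(clusterName, collector);
    }

    /** Queries every registered Collector and returns utilization keyed by "cluster/host". */
    Map<String, Double> snapshot() {
        Map<String, Double> merged = new HashMap<>();
        for (Map.Entry<String, Collector> c : collectors.entrySet()) {
            for (Map.Entry<String, Double> u : c.getValue().getCpuUtilization().entrySet()) {
                merged.put(c.getKey() + "/" + u.getKey(), u.getValue());
            }
        }
        return merged;
    }
}
```

Prefixing each host with its cluster name keeps identically named TaskTrackers in different clusters from overwriting each other.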

@Dhruba:

Thanks. Reusing the code in ProcfsBasedProcessTree is a good idea, but this class does not
provide CPU usage information.
I will see how to reuse this class to get both memory and CPU information.
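
As a rough illustration of the missing CPU piece: on Linux, per-process CPU time is available in /proc/<pid>/stat as fields 14 (utime) and 15 (stime), measured in clock ticks. The sketch below is not ProcfsBasedProcessTree code. The class and method names are hypothetical, and the ticks-per-second value is assumed to be supplied by the caller (it is commonly 100, but that depends on the system).

```java
/** Hypothetical helper that derives CPU usage from the contents of /proc/<pid>/stat. */
final class ProcStatCpu {
    private ProcStatCpu() {}

    /** Returns utime + stime, in clock ticks, from one /proc/<pid>/stat line. */
    static long cpuTicks(String statLine) {
        // The comm field (field 2) is parenthesized and may itself contain spaces
        // or parentheses, so split only the text after the last ')'.
        int close = statLine.lastIndexOf(')');
        if (close < 0) {
            throw new IllegalArgumentException("missing comm field: " + statLine);
        }
        String[] fields = statLine.substring(close + 1).trim().split("\\s+");
        // fields[0] is field 3 (state), so fields 14 and 15 are at indexes 11 and 12.
        if (fields.length < 13) {
            throw new IllegalArgumentException("too few fields: " + statLine);
        }
        long utime = Long.parseLong(fields[11]);
        long stime = Long.parseLong(fields[12]);
        return utime + stime;
    }

    /** Converts two tick samples into the fraction of one CPU used over the interval. */
    static double cpuFraction(long ticksBefore, long ticksAfter,
                              long elapsedMillis, long ticksPerSecond) {
        double cpuSeconds = (ticksAfter - ticksBefore) / (double) ticksPerSecond;
        return cpuSeconds / (elapsedMillis / 1000.0);
    }
}
```

Sampling cpuTicks twice for each process in a task's process tree and summing the deltas would give a per-task CPU figure to report alongside the memory numbers.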


> ResourceAwareLoadManager to dynamically decide new tasks based on current CPU/memory load on TaskTracker(s)
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-961
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-961
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: contrib/fair-share
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: HIVE-961.patch
>
>
> Design and develop a ResourceAwareLoadManager for the FairShare scheduler that dynamically
> decides how many maps/reduces to run on a particular machine based on the CPU/Memory/diskIO/network
> usage on that machine.  The amount of resources currently used on each task tracker is
> fed into the ResourceAwareLoadManager in real time via an entity that is external to Hadoop.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

