hadoop-mapreduce-issues mailing list archives

From "Vinod K V (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-220) Collecting cpu and memory usage for MapReduce tasks
Date Tue, 04 May 2010 17:11:07 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863894#action_12863894 ]

Vinod K V commented on MAPREDUCE-220:

bq. We probably should convert the fields in ResourceStatus into Counters and use that as
the primary interface for the end-user, also we should store them into JobHistory etc.
I second that. It will also solve two other issues with the patch:
 - the cpu and memory usage details of each task are sent in every heartbeat, making it bulky.
Translating them into Counters will cause them to be sent only once a minute
 - with Counters, we get the logging into JobHistory for free, as well as display on the web
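
A minimal sketch of the first point, assuming counters are attached to a heartbeat at most once per interval while plain heartbeats go out in between. HeartbeatPayload, CounterThrottle, and the two counter names are hypothetical and only illustrate the idea; they are not the patch's code or the TaskTracker's actual classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical heartbeat payload; not the real TaskTracker status class. */
final class HeartbeatPayload {
    final String taskId;
    /** Null when this heartbeat carries no counters. */
    final Map<String, Long> counters;

    HeartbeatPayload(String taskId, Map<String, Long> counters) {
        this.taskId = taskId;
        this.counters = counters;
    }
}

/** Hypothetical helper that attaches usage counters at most once per interval. */
final class CounterThrottle {
    private final long intervalMillis;
    private long lastSendMillis;
    private boolean sentOnce;

    CounterThrottle(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    HeartbeatPayload build(String taskId, long nowMillis, long cpuMillis, long memoryBytes) {
        // Counters are due on the first heartbeat and then once per interval.
        boolean due = !sentOnce || nowMillis - lastSendMillis >= intervalMillis;
        if (!due) {
            // Lightweight heartbeat: no usage details attached.
            return new HeartbeatPayload(taskId, null);
        }
        Map<String, Long> counters = new LinkedHashMap<>();
        counters.put("CPU_MILLISECONDS", cpuMillis);        // illustrative name
        counters.put("PHYSICAL_MEMORY_BYTES", memoryBytes); // illustrative name
        sentOnce = true;
        lastSendMillis = nowMillis;
        return new HeartbeatPayload(taskId, counters);
    }
}
```

With an interval of 60000 ms and a heartbeat every few seconds, only about one heartbeat per minute would carry the usage values; the rest stay small.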

Leaving that aside, I have one more comment on the TT side: For getting the cpu/memory usage
of a task, we construct the process-tree of the task repeatedly every time a heartbeat is sent.
 - For one, if we go the Counters way, we only need to do the calculations once a minute.
 - Otherwise, the process-trees for all tasks are now constructed by both TaskMemoryManager
and the TT main thread. This can become costly depending on the size of the process-tree. There
is an opportunity for refactoring here, I guess: maybe a single class that maintains all
the process-trees (TaskMemoryManager.ProcessTreeInfo?) and the corresponding statistics, within
a given precision, time-wise.
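
One way the suggested refactoring could look, as a rough sketch only: a single cache owns the per-task statistics and re-walks a task's process-tree only when the cached sample is older than the requested precision. UsageSample, ProcessTreeStatsCache, the injected clock, and the treeWalker function are all hypothetical names introduced for illustration; the real process-tree walk is stood in for by a function returning {cpuMillis, memoryBytes}.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical holder for one task's usage sample; not a Hadoop class. */
final class UsageSample {
    final long cpuMillis;
    final long memoryBytes;
    final long takenAtMillis;

    UsageSample(long cpuMillis, long memoryBytes, long takenAtMillis) {
        this.cpuMillis = cpuMillis;
        this.memoryBytes = memoryBytes;
        this.takenAtMillis = takenAtMillis;
    }
}

/** Hypothetical single owner of per-task process-tree statistics. */
final class ProcessTreeStatsCache {
    private final long precisionMillis;
    private final LongSupplier clock;
    // Stand-in for the expensive process-tree walk: taskId -> {cpuMillis, memoryBytes}.
    private final Function<String, long[]> treeWalker;
    private final Map<String, UsageSample> samples = new HashMap<>();

    ProcessTreeStatsCache(long precisionMillis, LongSupplier clock,
                          Function<String, long[]> treeWalker) {
        this.precisionMillis = precisionMillis;
        this.clock = clock;
        this.treeWalker = treeWalker;
    }

    /** Returns a sample no older than the precision, walking the tree only if needed. */
    synchronized UsageSample get(String taskId) {
        long now = clock.getAsLong();
        UsageSample cached = samples.get(taskId);
        if (cached != null && now - cached.takenAtMillis < precisionMillis) {
            return cached; // fresh enough: both callers share this sample
        }
        long[] usage = treeWalker.apply(taskId);
        UsageSample fresh = new UsageSample(usage[0], usage[1], now);
        samples.put(taskId, fresh);
        return fresh;
    }

    /** Drops the statistics of a finished task. */
    synchronized void remove(String taskId) {
        samples.remove(taskId);
    }
}
```

Under this sketch, both the memory manager and the heartbeat path would call get() on the same instance, so each task's tree is walked at most once per precision window no matter how many callers ask.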


> Collecting cpu and memory usage for MapReduce tasks
> ---------------------------------------------------
>                 Key: MAPREDUCE-220
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-220
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: task, tasktracker
>            Reporter: Hong Tang
>            Assignee: Scott Chen
>             Fix For: 0.22.0
>         Attachments: MAPREDUCE-220-v1.txt, MAPREDUCE-220.txt
> It would be nice for TaskTracker to collect cpu and memory usage for individual Map or
> Reduce tasks over time.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
