hadoop-mapreduce-issues mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Updated: (MAPREDUCE-901) Move Framework Counters into a TaskMetric structure
Date Mon, 24 Aug 2009 19:02:00 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated MAPREDUCE-901:

    Attachment: 901_1.patch

Attaching a patch for review. I am still testing the patch. Also, a little bit of cleanup
is required, especially w.r.t. naming variables/fields in the classes. I will do that in
a follow-up patch.

Some points on the approach:
1) Defined a class TaskMetrics with methods for updating the counters defined in o.a.h.mapreduce.TaskCounter.java.
It also provides a utility method for updating framework counters that aren't defined in TaskCounter.java,
for example the counters that the framework defines in the counter group FileSystemCounters.
For the TaskCounter counters, the RPC is optimized. For the other framework counters, such as
the FileSystemCounters, the RPC uses the Counters serialization.
2) The TaskMetrics object is serialized as part of the TaskStatus object sent in the heartbeats.
3) In TaskInProgress.java, the TIP's Counters object is updated with the counters obtained
from the heartbeat.
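
To make the three points above easier to picture, here is a rough, self-contained Java sketch. It is not the code in 901_1.patch. The class name TaskMetricsSketch, the three-value Metric enum (a stand-in for TaskCounter), the "group:name" key format, and the plain Map used in place of the TIP's Counters are all assumptions made for illustration. The sketch also assumes each heartbeat carries cumulative totals, so the merge overwrites rather than adds.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of the approach; not the class from the patch. */
public class TaskMetricsSketch {
    /** Stand-in for a few TaskCounter values (assumed subset). */
    public enum Metric { MAP_INPUT_RECORDS, MAP_OUTPUT_RECORDS, SPILLED_RECORDS }

    // Point 1: one fixed slot per known metric, indexed by enum ordinal.
    private final long[] fixed = new long[Metric.values().length];
    // Framework counters without a fixed slot, keyed by "group:name".
    private final Map<String, Long> other = new TreeMap<>();

    public void increment(Metric m, long delta) { fixed[m.ordinal()] += delta; }

    public void incrementOther(String group, String name, long delta) {
        other.merge(group + ":" + name, delta, Long::sum);
    }

    public long get(Metric m) { return fixed[m.ordinal()]; }

    public long getOther(String group, String name) {
        return other.getOrDefault(group + ":" + name, 0L);
    }

    // Point 2: compact wire form. Fixed metrics go out as bare longs with
    // no names; the remaining counters fall back to name/value pairs.
    public void write(DataOutput out) throws IOException {
        for (long v : fixed) out.writeLong(v);
        out.writeInt(other.size());
        for (Map.Entry<String, Long> e : other.entrySet()) {
            out.writeUTF(e.getKey());
            out.writeLong(e.getValue());
        }
    }

    public void readFields(DataInput in) throws IOException {
        for (int i = 0; i < fixed.length; i++) fixed[i] = in.readLong();
        other.clear();
        int n = in.readInt();
        for (int i = 0; i < n; i++) {
            String key = in.readUTF();
            other.put(key, in.readLong());
        }
    }

    // Point 3: receiver-side update. A plain map stands in for the TIP's
    // Counters; values are assumed cumulative, so they overwrite.
    public void mergeInto(Map<String, Long> tipCounters) {
        for (Metric m : Metric.values()) {
            tipCounters.put("TaskCounter:" + m.name(), fixed[m.ordinal()]);
        }
        tipCounters.putAll(other);
    }

    /** Serializes to a byte array, as a heartbeat payload would be. */
    public static byte[] toBytes(TaskMetricsSketch m) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            m.write(new DataOutputStream(buf));
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Rebuilds a metrics object from bytes produced by toBytes. */
    public static TaskMetricsSketch fromBytes(byte[] bytes) {
        try {
            TaskMetricsSketch m = new TaskMetricsSketch();
            m.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
            return m;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch the fixed metrics cost 8 bytes each on the wire regardless of their names, while every counter in the fallback map also pays for its key string. That difference is the kind of saving the optimized RPC path is after.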

Would really appreciate a review on this one.

And yes, this looks like a good thing to have for the JIRAs MAPREDUCE-220 and MAPREDUCE-718.

> Move Framework Counters into a TaskMetric structure
> ---------------------------------------------------
>                 Key: MAPREDUCE-901
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-901
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: task
>    Affects Versions: 0.21.0
>            Reporter: Owen O'Malley
>            Assignee: Devaraj Das
>             Fix For: 0.21.0
>         Attachments: 901_1.patch
> I think we should move all of the Counters that the framework updates into a single class
> called TaskMetrics. TaskMetrics would have specific fields for each of the metrics like input
> records, input bytes, output records, etc.
> It would both reduce the serialized size of the heartbeats (by shrinking the Counters
> down to just the user's counters) and decrease the latency for updates to the JobTracker (since
> Counters are sent at most 1/minute instead of 1/heartbeat).

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
