hadoop-mapreduce-issues mailing list archives

From "Matei Zaharia (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1881) Improve TaskTrackerInstrumentation
Date Wed, 11 Aug 2010 02:11:19 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12897114#action_12897114 ]

Matei Zaharia commented on MAPREDUCE-1881:
------------------------------------------

By necessity, do you mean "why should Hadoop provide this feature rather than letting users
implement it themselves"? The answer is pretty simple -- since many users will want to do
the same thing, it makes sense to put it into the platform instead of asking them all to reinvent
it. The goal of the JIRA process is not to minimize changes to Hadoop, it's to make Hadoop
better. One can imagine many useful instrumentation classes being written that people will
combine (already, lots of people are using the default metrics one).

I actually opened this issue because I'm working on a project where I want to programmatically
launch a TaskTracker with an extra instrumentation class on top of the ones the user configured
in mapred-site.xml. I could do it by setting the parameter to a composite class, and then
passing it the old parameter, but it felt more natural to add support for multiple instrumentation
objects and just append to the user's list. I care more about the second part of the issue
(statusUpdate callback) though, because my project can't work at all without that.
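
For illustration, the composite-class workaround mentioned above could look roughly like the following. This is only a sketch under stated assumptions: the Instrumentation interface here is a hypothetical, simplified stand-in (the real TaskTrackerInstrumentation has more callbacks and different signatures), and the Composite and Recording names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CompositeInstrumentationSketch {

    // Hypothetical, simplified stand-in for TaskTrackerInstrumentation.
    // Task attempts are identified by plain strings to keep the sketch small.
    interface Instrumentation {
        void reportTaskLaunch(String taskAttemptId);
        void reportTaskEnd(String taskAttemptId);
    }

    // Composite: forwards every event to each delegate, in list order.
    static final class Composite implements Instrumentation {
        private final List<Instrumentation> delegates;

        Composite(List<Instrumentation> delegates) {
            // Defensive copy so later changes to the caller's list have no effect.
            this.delegates = new ArrayList<>(delegates);
        }

        @Override
        public void reportTaskLaunch(String taskAttemptId) {
            for (Instrumentation d : delegates) {
                d.reportTaskLaunch(taskAttemptId);
            }
        }

        @Override
        public void reportTaskEnd(String taskAttemptId) {
            for (Instrumentation d : delegates) {
                d.reportTaskEnd(taskAttemptId);
            }
        }
    }

    // Example delegate (hypothetical): appends each event to a shared log.
    static final class Recording implements Instrumentation {
        private final String name;
        private final List<String> log;

        Recording(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void reportTaskLaunch(String taskAttemptId) {
            log.add(name + ":launch:" + taskAttemptId);
        }

        @Override
        public void reportTaskEnd(String taskAttemptId) {
            log.add(name + ":end:" + taskAttemptId);
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Instrumentation userConfigured = new Recording("user", log);
        Instrumentation extra = new Recording("extra", log);

        // Wrap the user's instrumentation and the extra one behind a single object.
        Instrumentation all = new Composite(Arrays.asList(userConfigured, extra));
        all.reportTaskLaunch("attempt_1");
        all.reportTaskEnd("attempt_1");
        System.out.println(log);
    }
}
```

With this approach the caller has to build the wrapper and hand it the previously configured instrumentation, which is the extra step that built-in support for a list of classes would avoid.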

> Improve TaskTrackerInstrumentation
> ----------------------------------
>
>                 Key: MAPREDUCE-1881
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1881
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Matei Zaharia
>            Assignee: Matei Zaharia
>            Priority: Minor
>         Attachments: mapreduce-1881-v2.patch, mapreduce-1881-v2b.patch, mapreduce-1881.patch
>
>
> The TaskTrackerInstrumentation class provides a useful way to capture key events at the
TaskTracker for use in various reporting tools, but it is currently rather limited, because
only one TaskTrackerInstrumentation can be added to a given TaskTracker and this object receives
minimal information about tasks (only their IDs). I propose enhancing the functionality through
two changes:
> # Support a comma-separated list of TaskTrackerInstrumentation classes rather than just
a single one in the JobConf, and report events to all of them.
> # Make the reportTaskLaunch and reportTaskEnd methods in TaskTrackerInstrumentation receive
a reference to a whole Task object rather than just its TaskAttemptID. It might also be useful
to make the latter receive the task's final state, i.e. failed, killed, or successful.
> I'm just posting this here to get a sense of whether this is a good idea. If people think
it's okay, I will make a patch against trunk that implements these changes.
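
The two proposed changes can be sketched together as follows. This is a minimal illustration, not the attached patch: the Task, FinalState, and Instrumentation types are hypothetical stand-ins defined inside the example, the class list is passed as a plain string rather than read from a JobConf, and instances are created through a no-argument constructor, which may differ from how the real class is constructed.

```java
import java.util.ArrayList;
import java.util.List;

final class MultiInstrumentationSketch {

    // Hypothetical final states, following "failed, killed, or successful".
    public enum FinalState { SUCCEEDED, FAILED, KILLED }

    // Hypothetical stand-in for a whole Task object (more than just an ID).
    public static final class Task {
        public final String attemptId;
        public final boolean isMap;

        public Task(String attemptId, boolean isMap) {
            this.attemptId = attemptId;
            this.isMap = isMap;
        }
    }

    // Hypothetical base class; callbacks default to no-ops.
    public static class Instrumentation {
        public void reportTaskLaunch(Task task) { }
        public void reportTaskEnd(Task task, FinalState state) { }
    }

    // Example subclass that records the events it receives.
    public static class RecordingInstrumentation extends Instrumentation {
        public final List<String> events = new ArrayList<>();

        @Override
        public void reportTaskLaunch(Task task) {
            events.add("launch " + task.attemptId);
        }

        @Override
        public void reportTaskEnd(Task task, FinalState state) {
            events.add("end " + task.attemptId + " " + state);
        }
    }

    // Change 1: turn a comma-separated list of class names into instances.
    static List<Instrumentation> load(String commaSeparatedClassNames) {
        List<Instrumentation> result = new ArrayList<>();
        for (String name : commaSeparatedClassNames.split(",")) {
            String trimmed = name.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas and whitespace
            }
            try {
                Class<? extends Instrumentation> cls =
                        Class.forName(trimmed).asSubclass(Instrumentation.class);
                result.add(cls.getDeclaredConstructor().newInstance());
            } catch (ReflectiveOperationException | ClassCastException e) {
                throw new IllegalArgumentException("Cannot load " + trimmed, e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String configured = RecordingInstrumentation.class.getName()
                + ", " + Instrumentation.class.getName();
        List<Instrumentation> all = load(configured);

        // Change 2: every instance receives the Task and its final state.
        Task task = new Task("attempt_1", true);
        for (Instrumentation inst : all) {
            inst.reportTaskLaunch(task);
            inst.reportTaskEnd(task, FinalState.KILLED);
        }
        System.out.println(((RecordingInstrumentation) all.get(0)).events);
    }
}
```

Passing the final state to reportTaskEnd lets a reporting tool distinguish failed, killed, and successful attempts without looking the task up elsewhere.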

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

