[ https://issues.apache.org/jira/browse/HADOOP-492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
David Bowen updated HADOOP-492:
-------------------------------
Attachment: counters.patch
Here's a patch for review. Some issues and notes:
* I haven't managed to test this properly with LocalJobRunner because (I think) the namenode
keeps throwing SafeModeException. Tips on how to resolve this would be appreciated.
* I called the main counters class Statistics. Maybe it should be called Counters?
* I added a couple of counters to the WordCount example. If it is preferred to keep that
example minimalist, these don't need to go there.
* From the job info page you can navigate to per-tip and per-task counters if you are interested.
* JobInProgress sends the per-job counters to the metrics package whenever it updates them.
> Global counters
> ---------------
>
> Key: HADOOP-492
> URL: https://issues.apache.org/jira/browse/HADOOP-492
> Project: Hadoop
> Issue Type: New Feature
> Components: mapred
> Reporter: arkady borkovsky
> Assigned To: David Bowen
> Attachments: counters.patch
>
>
> It would be nice to have a map/reduce job keep aggregated counts for arbitrary events
occurring in its tasks -- the number of records processed, the number of exceptions of a specific
type, the number of sentences in passive voice, whatever the job finds useful.
> This can be implemented by tasks periodically sending <name, value> pairs to the
jobtracker (in some implementations such messages are piggy-backed on the heartbeats), so
that the jobtracker stores the latest values from each task and aggregates them on
request. It should also make the aggregated values available at the job end. The value for
a task would be flushed when the task fails.
> #491 and #490 may be related to this one.
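The aggregation scheme described in the issue could be sketched roughly as follows. This is only an illustration and not the code in counters.patch: the class name CounterAggregator and its methods are hypothetical, and it assumes that "flushed" means a failed task's values are discarded from the totals.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the jobtracker-side bookkeeping; not the classes in counters.patch.
class CounterAggregator {
    // Latest counter values reported by each task, keyed by task id.
    private final Map<String, Map<String, Long>> latestByTask = new HashMap<>();

    // Called when a task reports (e.g. with a heartbeat); replaces that task's previous values.
    synchronized void report(String taskId, Map<String, Long> counters) {
        latestByTask.put(taskId, new HashMap<>(counters));
    }

    // Called when a task fails; assumption: its contribution is dropped.
    synchronized void taskFailed(String taskId) {
        latestByTask.remove(taskId);
    }

    // Sums the latest values across all known tasks, computed on request.
    synchronized Map<String, Long> aggregate() {
        Map<String, Long> totals = new TreeMap<>();
        for (Map<String, Long> counters : latestByTask.values()) {
            for (Map.Entry<String, Long> e : counters.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return totals;
    }
}
```

Because each report replaces the task's previous values instead of adding to them, a task can resend its running totals on every heartbeat without being double counted.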
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.