hadoop-common-dev mailing list archives

From "David Bowen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-492) Global counters
Date Thu, 22 Feb 2007 21:51:05 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12475172 ]

David Bowen commented on HADOOP-492:

Thanks Andrzej and Doug for your comments.  There is one vote for changing the class name
to Counters, and none against, so unless anyone else wants to argue about it, I will switch
to Counters.

Re Doug's comments:

1. Reporter: I was thinking only about source compatibility, which is actually improved by
making this an abstract class so that code like this:

Reporter reporter = new Reporter() {
    // definitions of abstract methods
};

will still work because the new method (incrCounter(String name)) is not abstract.  However,
that matters little in practice: users have no reason to implement Reporter, so they are not
affected.  The downside of the change is that it breaks binary compatibility, so users would
need to recompile their applications, and there isn't a good enough reason to require that.
So I will change it back to an interface.
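
To illustrate the source-compatibility point, here is a minimal sketch.  SketchReporter is a
hypothetical stand-in, not Hadoop's actual Reporter; the abstract method setStatus and the
calls() helper are assumptions made only for the example.  The one detail taken from the
discussion is that the newly added incrCounter(String name) is concrete.

```java
// Hypothetical stand-in for the abstract-class variant discussed above.
// This is NOT Hadoop's real Reporter; names other than incrCounter are assumed.
abstract class SketchReporter {
    private long calls = 0;

    // An existing abstract method that anonymous subclasses already define (assumed name).
    abstract void setStatus(String status);

    // The newly added method. Because it is concrete, anonymous subclasses
    // written before it existed still compile without modification.
    void incrCounter(String name) {
        calls++;
    }

    // Helper for the example only: how many times incrCounter was invoked.
    long calls() {
        return calls;
    }
}

class ReporterSketchDemo {
    // Code in the style of the snippet above: it defines only the abstract methods.
    static SketchReporter legacyReporter() {
        return new SketchReporter() {
            @Override
            void setStatus(String status) {
                // no-op for the example
            }
        };
    }

    public static void main(String[] args) {
        SketchReporter reporter = legacyReporter();
        reporter.incrCounter("records");
        System.out.println(reporter.calls());
    }
}
```

Had incrCounter been added as a new method on an interface instead, legacyReporter() would
need a definition of it before it compiled again.  The binary-compatibility cost is separate:
classes already compiled against an interface named Reporter would not link against a class
of the same name without recompilation.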

2. I will remove the method that I commented out.

> Global counters
> ---------------
>                 Key: HADOOP-492
>                 URL: https://issues.apache.org/jira/browse/HADOOP-492
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: arkady borkovsky
>         Assigned To: David Bowen
>         Attachments: counters1.patch
> It would be nice to have map / reduce job keep aggregated counts for arbitrary events
occurring in its tasks -- the number of records processed, the number of exceptions of a specific
type, the number of sentences in passive voice, whatever the job finds useful.
> This can be implemented by tasks periodically sending <name, value> pairs to the
jobtracker (in some implementations such messages are piggy-backed on the heartbeats), so
that the job tracker stores the latest values from each task and aggregates them on
request.  It should also make the aggregated values available at the job end.  The value for
a task would be flushed when the task fails.
> #491 and #490 may be related to this one.
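
The aggregation scheme described in the issue can be sketched roughly as follows.  This is
not the code in counters1.patch: the class CounterAggregatorSketch and its methods are
hypothetical, and it assumes that "flushed" means a failed task's values are discarded.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical job-tracker-side store; all names here are assumptions for illustration.
class CounterAggregatorSketch {
    // task id -> the latest <name, value> pairs reported by that task
    private final Map<String, Map<String, Long>> latestByTask = new HashMap<>();

    // Called when a task's periodic report (e.g. on a heartbeat) arrives.
    // A newer report replaces the task's previous one.
    void report(String taskId, Map<String, Long> snapshot) {
        latestByTask.put(taskId, new HashMap<>(snapshot));
    }

    // Assumed reading of "flushed": a failed task's values are dropped.
    void taskFailed(String taskId) {
        latestByTask.remove(taskId);
    }

    // Sums the latest value of every counter across all tasks, on request.
    Map<String, Long> aggregate() {
        Map<String, Long> totals = new TreeMap<>();
        for (Map<String, Long> snapshot : latestByTask.values()) {
            for (Map.Entry<String, Long> entry : snapshot.entrySet()) {
                totals.merge(entry.getKey(), entry.getValue(), Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        CounterAggregatorSketch tracker = new CounterAggregatorSketch();

        Map<String, Long> first = new HashMap<>();
        first.put("records", 10L);
        tracker.report("task_1", first);

        Map<String, Long> second = new HashMap<>();
        second.put("records", 5L);
        second.put("exceptions", 1L);
        tracker.report("task_2", second);

        System.out.println(tracker.aggregate());
    }
}
```

Storing only the latest snapshot per task keeps a repeated or retried report from being
counted twice, because aggregation always starts from per-task totals rather than increments.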

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
