hadoop-common-dev mailing list archives

From "David Bowen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-492) Global counters
Date Thu, 01 Mar 2007 16:39:51 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12477027 ]

David Bowen commented on HADOOP-492:
------------------------------------


> 1) Why does the method to increment a counter take an enum whereas the method to read
> the value takes a String?
> Wouldn't it be more convenient if Counters.getCounter() also took an enum?

Yes it would.  The issue is that Counters objects move between processes, including back to
the client.  I don't think we can safely assume that the right Enum type will be available
everywhere.  
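
To illustrate the constraint, here is a minimal sketch and not the actual Counters
class. SketchCounters and DemoCounter are hypothetical names. Values are stored under
the enum constant's name, so a process that lacks the enum class can still read them
by String, while a caller that does have the enum can use a convenience overload.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch; not the real Hadoop Counters class. */
final class SketchCounters {
    /** Hypothetical user-defined counter enum, known only to the job's own code. */
    enum DemoCounter { RECORDS, ERRORS }

    // Keyed by the constant's name, so the map can be shipped to a process
    // that does not have the enum class on its classpath.
    private final Map<String, Long> values = new TreeMap<>();

    /** Task side: the enum type is available here, so accept it directly. */
    void incrCounter(Enum<?> key, long amount) {
        values.merge(key.name(), amount, Long::sum);
    }

    /** Client side: look up by name; no enum class required. */
    long getCounter(String name) {
        return values.getOrDefault(name, 0L);
    }

    /** Convenience overload for callers that do have the enum type. */
    long getCounter(Enum<?> key) {
        return getCounter(key.name());
    }
}
```

The enum overload only delegates to the String lookup, so it adds convenience without
requiring the enum type wherever a Counters object ends up.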

FYI, I've changed the Counters API in a patch attached to HADOOP-1041, but it isn't any simpler
:-(.  Counters are now grouped by the enum type that they came from.
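
As a rough illustration of grouping by enum type (a sketch under assumptions, not the
code in the HADOOP-1041 patch), the group name below is taken to be the enum's class
name. GroupedCountersSketch, ParseEvents and IoEvents are hypothetical names.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of counters grouped by the enum type they came from. */
final class GroupedCountersSketch {
    // Two hypothetical enums that happen to share a constant name.
    enum ParseEvents { RECORDS, ERRORS }
    enum IoEvents { ERRORS }

    // group name (assumed here to be the enum class name) -> counter name -> value
    private final Map<String, Map<String, Long>> groups = new TreeMap<>();

    void incrCounter(Enum<?> key, long amount) {
        String group = key.getDeclaringClass().getName();
        groups.computeIfAbsent(group, g -> new TreeMap<>())
              .merge(key.name(), amount, Long::sum);
    }

    /** Lookup by strings only, so no enum class is needed on the reading side. */
    long getCounter(String group, String name) {
        Map<String, Long> g = groups.get(group);
        return g == null ? 0L : g.getOrDefault(name, 0L);
    }
}
```

Grouping keeps ParseEvents.ERRORS and IoEvents.ERRORS apart even though both constants
are named ERRORS.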

With regard to your test, it could be a bug.  It would be interesting to see if you get a
similar discrepancy after applying the 1041 patch.



> Global counters
> ---------------
>
>                 Key: HADOOP-492
>                 URL: https://issues.apache.org/jira/browse/HADOOP-492
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: arkady borkovsky
>         Assigned To: David Bowen
>             Fix For: 0.12.0
>
>         Attachments: counters1.patch, counters2.patch, counters3.patch
>
>
> It would be nice to have a map/reduce job keep aggregated counts for arbitrary events
> occurring in its tasks -- the number of records processed, the number of exceptions of a specific
> type, the number of sentences in passive voice, whatever the job finds useful.
> This can be implemented by tasks periodically sending <name, value> pairs to the
> jobtracker (in some implementations such messages are piggy-backed on the heartbeats), so
> that the jobtracker stores all the latest values from each task and aggregates them on
> request.  It should also make the aggregated values available at the job end.  The value for
> a task would be flushed when the task fails.
> #491 and #490 may be related to this one.
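
The scheme described in the issue can be sketched roughly as follows. This is an
illustration under assumptions, not the jobtracker's actual code: CounterAggregatorSketch
and its method names are hypothetical, "flushed" is read here as "discarded", and the
heartbeat transport is left out entirely.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of per-task counter snapshots aggregated on request. */
final class CounterAggregatorSketch {
    // task id -> latest <name, value> pairs reported by that task
    private final Map<String, Map<String, Long>> latestByTask = new HashMap<>();

    /** A task reports its current totals; a newer report replaces the older one. */
    void report(String taskId, Map<String, Long> snapshot) {
        latestByTask.put(taskId, new HashMap<>(snapshot));
    }

    /** Discard a failed task's values so they do not count toward the job. */
    void taskFailed(String taskId) {
        latestByTask.remove(taskId);
    }

    /** Sum the latest value of every counter across all known tasks. */
    Map<String, Long> aggregate() {
        Map<String, Long> totals = new TreeMap<>();
        for (Map<String, Long> snapshot : latestByTask.values()) {
            for (Map.Entry<String, Long> e : snapshot.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return totals;
    }
}
```

Keeping only the latest snapshot per task means a repeated report cannot double count,
and dropping a failed task's entry removes its contribution in one step.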

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

