hadoop-common-dev mailing list archives

From "David Bowen (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-1041) Counter names are ugly
Date Thu, 01 Mar 2007 04:00:50 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-1041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Bowen updated HADOOP-1041:

    Attachment: 1041.patch

This patch implements the grouped display of counters.  The individual counters are in the
order that they were declared in the originating Enum class, so the developer has full control
over the ordering.

Optionally, a ResourceBundle may be provided.  (I had coded this before people started to
vote that it wasn't necessary.)  The resource bundle goes in the same package directory as
the class containing the enum, and must have the name <class name>_<enum name>.properties.
E.g. for the MapReduce counters, I created an enum called Counter in Task.java, so the bundle
is Task_Counter.properties.  See that file for how to customize both the group name and the
counter names.
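
As a rough illustration of the naming rule above (this is not the code in the patch), the
bundle base name can be derived from the enum's class name by replacing the '$' that Java
puts between an outer class and its nested enum with '_'.  The Task class below is only a
stand-in, and the "<COUNTER>.name" key format is an assumption; Task_Counter.properties is
the authority on the real keys.

```java
import java.util.Locale;
import java.util.MissingResourceException;
import java.util.ResourceBundle;

// Hypothetical stand-in for the class that contains the counter enum.
class Task {
    enum Counter { MAP_INPUT_RECORDS, MAP_OUTPUT_RECORDS }
}

final class CounterBundleNames {
    /** Base name of the bundle: "<class name>_<enum name>", package included. */
    static String bundleBaseName(Class<?> enumClass) {
        // Class.getName() for a nested enum is "pkg.Outer$Inner"; swap '$' for '_'.
        return enumClass.getName().replace('$', '_');
    }

    /** Display name from the bundle if present, else the enum constant's own name. */
    static String displayName(Enum<?> key) {
        try {
            ResourceBundle bundle = ResourceBundle.getBundle(
                bundleBaseName(key.getDeclaringClass()), Locale.getDefault());
            // Assumed key format: "<COUNTER>.name"
            return bundle.getString(key.name() + ".name");
        } catch (MissingResourceException e) {
            // No bundle or no entry: the bundle is optional, so fall back.
            return key.name();
        }
    }
}
```

With no properties file on the classpath, displayName simply returns the enum constant's
name, which matches the bundle being optional.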

There are also some changes to address HADOOP-1048.  Now the task counters are not summed
every time a task status gets updated.  Instead, they are summed when someone (a client,
a JSP page, or a callback from the metrics package) requests them.  I changed JobSubmissionProtocol
to allow fetching the counters on demand.  I bumped up its version, which I guess I should
have also done on the previous patch in which the Counters object was included in the JobStatus.
(I could have kept it there, but it seemed a bit inconsistent now that the counters are computed
on demand, since everything else in the JobStatus is kept up-to-date.)
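
The on-demand summing can be sketched roughly as follows.  This is a simplified model, not
the patch itself: LazyJobCounters, updateTaskStatus and getCounters are hypothetical names,
and counters are plain String-to-Long maps here.  It uses Java 8 library calls for brevity.

```java
import java.util.HashMap;
import java.util.Map;

final class LazyJobCounters {
    // Latest counters reported by each task, keyed by a task id.
    private final Map<String, Map<String, Long>> perTask = new HashMap<>();

    /** Called on every task status update: store the snapshot, do not sum. */
    synchronized void updateTaskStatus(String taskId, Map<String, Long> counters) {
        perTask.put(taskId, new HashMap<>(counters));
    }

    /** Called only when a client, page, or metrics callback asks for totals. */
    synchronized Map<String, Long> getCounters() {
        Map<String, Long> totals = new HashMap<>();
        for (Map<String, Long> counters : perTask.values()) {
            for (Map.Entry<String, Long> e : counters.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return totals;
    }
}
```

The cost of summing moves from the frequent update path to the much rarer read path.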

A related JobTracker efficiency change was to stop updating the job metrics (via the metrics
package) every time a task update is received.  Instead this is now only done when the metrics
timer-based callback occurs (see JobTrackerMetrics.doUpdates()).  This means that this callback
needs to get a list of the running jobs - I think I implemented that with the correct locking
in the new method getRunningJobs, but someone might want to double check.

Changes affecting TaskTracker:

The incrementCounter method should now be somewhat more efficient because it doesn't (normally)
involve any String or Long construction.  The Counters object now holds a map of maps, where
the enum class name is the index into the first.  (Previously it was just a map, so the key
had to be constructed by string concatenation.)  The serialized form of Counter is a bit more
concise, since the enum class name is only written once.
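
A minimal sketch of the map-of-maps idea, under my own assumptions about the details: the
outer map is keyed by the enum class name, the inner map by the enum constant, and each
value is a one-element long[] so that an increment allocates nothing once the entry exists.
GroupedCounters and DemoCounter are hypothetical names, and the sketch uses Java 8 calls
that the original code could not have used.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical user-defined counter enum.
enum DemoCounter { BYTES_WRITTEN, RECORDS_WRITTEN }

final class GroupedCounters {
    // group (enum class name) -> counter (enum constant) -> mutable value cell
    private final Map<String, Map<Enum<?>, long[]>> groups = new HashMap<>();

    void incrementCounter(Enum<?> key, long amount) {
        // Class.getName() returns an existing String; no concatenation needed.
        String group = key.getDeclaringClass().getName();
        Map<Enum<?>, long[]> counters =
            groups.computeIfAbsent(group, g -> new HashMap<>());
        // The cell is allocated only the first time a counter is seen.
        long[] cell = counters.computeIfAbsent(key, k -> new long[1]);
        cell[0] += amount;
    }

    long getCounter(Enum<?> key) {
        Map<Enum<?>, long[]> counters =
            groups.get(key.getDeclaringClass().getName());
        if (counters == null) {
            return 0L;
        }
        long[] cell = counters.get(key);
        return cell == null ? 0L : cell[0];
    }
}
```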

I didn't change the PROGRESS_INTERVAL (1 second) at which MapTasks report their progress
to the TaskTracker, because I don't think it is relevant to Hadoop-1048 which is a JobTracker

JSP stuff:

There is a change that was requested in Hadoop-1038, namely to show both the map and reduce
phase counters on the main jobdetails page.

I added a refresh parameter to jobdetails so that a refresh time in seconds can be specified
(it was refreshing every 60 seconds).  I changed the jobtracker page to use the refresh parameter
so that running jobs by default get refreshed every 10 seconds, but completed and failed jobs
don't get refreshed.

The counts are now right-justified and decimal-formatted.
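
For illustration only, decimal formatting of a count can be done with java.text.NumberFormat;
whether the JSP pages do it exactly this way is an assumption, and CounterFormat is a
hypothetical helper name.  The locale is fixed to US here so the grouping separator is
predictable.

```java
import java.text.NumberFormat;
import java.util.Locale;

final class CounterFormat {
    /** Formats a count with grouping separators for display. */
    static String format(long value) {
        return NumberFormat.getIntegerInstance(Locale.US).format(value);
    }
}
```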

> Counter names are ugly
> ----------------------
>                 Key: HADOOP-1041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1041
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.12.0
>            Reporter: Owen O'Malley
>         Assigned To: David Bowen
>             Fix For: 0.12.0
>         Attachments: 1041.patch
> Having the complete class name in the counter names makes them unique, but they are ugly
> to present to non-developers. It would be nice to have some way to have a nicer string presented
> to the user. Currently, the Enum is converted to a name like:
> key.getDeclaringClass().getName() + "#" + key.toString()
> which gives counter names like "org.apache.hadoop.examples.RandomWriter$Counters#BYTES_WRITTEN"
> which is unique, but not very user friendly. Perhaps, we should strip off the class name
> for presenting to the users, which would allow them to make nice names. In particular, you
> could define an enum type that overloaded toString to print a nice user friendly string.
> Thoughts?

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
