hadoop-mapreduce-issues mailing list archives

From "Hong Tang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-740) Provide summary information per job once a job is finished.
Date Thu, 09 Jul 2009 20:20:14 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729400#action_12729400 ]

Hong Tang commented on MAPREDUCE-740:


I do have a specific use case: we want to keep track of the amount of resources used
by each job, each user, or each queue (for the capacity scheduler). Granted, all of this
information is readily available in the job history log. However, depending on job history
logs has a few drawbacks: (1) we are interested in keeping a history of finished jobs and
possibly doing group-bys by user and queue, so scraping individual history logs is messy;
(2) it adds a dependency on keeping up with possible future changes to the history log format.
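To illustrate the group-by use case, here is a minimal sketch that assumes each finished job yields one record carrying a user, a queue, and slot-hour figures. The record layout, the field names, and the helper `total_slot_hours_by` are hypothetical and not part of any Hadoop API.

```python
from collections import defaultdict

# Hypothetical per-job summary records; the field names are illustrative only.
jobs = [
    {"user": "alice", "queue": "research", "map_slot_hours": 12.0, "reduce_slot_hours": 3.0},
    {"user": "bob", "queue": "research", "map_slot_hours": 4.0, "reduce_slot_hours": 1.0},
    {"user": "alice", "queue": "production", "map_slot_hours": 8.0, "reduce_slot_hours": 2.0},
]


def total_slot_hours_by(records, key):
    """Sum map + reduce slot hours, grouped by the given field (e.g. 'user' or 'queue')."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec[key]] += rec["map_slot_hours"] + rec["reduce_slot_hours"]
    return dict(totals)


by_user = total_slot_hours_by(jobs, "user")
by_queue = total_slot_hours_by(jobs, "queue")
```

With one record per job, this kind of aggregation needs no knowledge of the history log format.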

For starters, I think the summary should include the following information:
	- job queuing/waiting time
	- job start time
	- job finish time
	- total maps/reduces
	- user id
	- job id (job-tracker ID + job sequence number)
	- map/reduce slot hours (need to apply a multiplier for high-RAM tasks that take multiple
	  slots per map/reduce task)
	- queue name
	- job status (success or failure)
	- cluster map/reduce slot capacity
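As a rough sketch of what a one-line record with these fields might look like, the following uses a hypothetical `JobSummary` class and a comma-separated key=value layout. The field names, the units, the sample job id, and the line format are all assumptions made for illustration, not an existing JobTracker output format.

```python
from dataclasses import dataclass, fields


@dataclass
class JobSummary:
    # Hypothetical field names mirroring the list above.
    job_id: str
    user: str
    queue: str
    status: str
    submit_time_ms: int
    start_time_ms: int
    finish_time_ms: int
    total_maps: int
    total_reduces: int
    map_slot_hours: float
    reduce_slot_hours: float
    cluster_map_capacity: int
    cluster_reduce_capacity: int

    @property
    def wait_time_ms(self):
        # Queuing/waiting time, taken here as start time minus submit time.
        return self.start_time_ms - self.submit_time_ms

    def to_line(self):
        # One key=value pair per field; values are assumed to contain no commas.
        parts = [f"{f.name}={getattr(self, f.name)}" for f in fields(self)]
        parts.append(f"wait_time_ms={self.wait_time_ms}")
        return ",".join(parts)


summary = JobSummary(
    job_id="job_200907091200_0001", user="alice", queue="research",
    status="SUCCESS", submit_time_ms=1000, start_time_ms=6000,
    finish_time_ms=906000, total_maps=40, total_reduces=4,
    map_slot_hours=12.5, reduce_slot_hours=3.0,
    cluster_map_capacity=200, cluster_reduce_capacity=100,
)
line = summary.to_line()
```

A fixed, flat layout like this is easier to keep stable across releases than the full history log.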

The only item the job history log does not currently provide is the slot hours for all maps
and reduces belonging to the same job.
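One possible way to compute slot hours, sketched under assumptions: each task attempt has a start time, a finish time, and a number of slots it occupies (greater than 1 for high-RAM tasks). The `TaskAttempt` class and its fields are hypothetical, and whether failed or killed attempts should be counted is left open here.

```python
from dataclasses import dataclass

MS_PER_HOUR = 3_600_000


@dataclass
class TaskAttempt:
    # Hypothetical record; times are in milliseconds.
    start_ms: int
    finish_ms: int
    slots: int = 1  # multiplier for tasks that occupy more than one slot


def slot_hours(attempts):
    """Sum of (run time x slots occupied) over all attempts, in hours."""
    slot_ms = sum((a.finish_ms - a.start_ms) * a.slots for a in attempts)
    return slot_ms / MS_PER_HOUR


# A 30-minute single-slot map plus a 60-minute map that holds two slots.
maps = [TaskAttempt(0, 1_800_000), TaskAttempt(0, 3_600_000, slots=2)]
map_slot_hours = slot_hours(maps)  # 0.5 + 2.0 = 2.5
```

The same function would apply to reduce attempts, giving separate map and reduce slot-hour totals per job.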

> Provide summary information per job once a job is finished.
> -----------------------------------------------------------
>                 Key: MAPREDUCE-740
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-740
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Hong Tang
>            Priority: Minor
> It would be nice if JobTracker could output a one-line summary per job once
> a job is finished. Otherwise, users or system administrators would end up scraping individual
> job history logs.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
