hadoop-mapreduce-issues mailing list archives

From "Dick King (JIRA)" <j...@apache.org>
Subject [jira] Created: (MAPREDUCE-751) Rumen: a tool to extract job characterization data from job tracker logs
Date Sun, 12 Jul 2009 04:05:14 GMT
Rumen: a tool to extract job characterization data from job tracker logs

                 Key: MAPREDUCE-751
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-751
             Project: Hadoop Map/Reduce
          Issue Type: Bug
            Reporter: Dick King

 We propose a new map/reduce component, rumen, which can be used to process job history logs
to produce any or all of the following:

      * Retrospective statistics on how long it would have taken to
launch a job into a given percentage of the mapper slots in the log's
cluster, given the load over the period covered by the log

      * Statistics on the runtimes, shuffle times, etc. of the tasks
and jobs covered by the log

      * Files describing detailed job trace information, and the
network topology as inferred from the host locations and rack IDs that
appear in the job tracker log.  Rumen also includes readers that return
this job and detailed task information to other tools.

        These other tools include a more advanced version of gridmix and also mumak:
see the blocked issues.
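
To make two of the outputs above concrete, here is a minimal sketch (in Python, not rumen's actual code) of inferring a rack-to-host topology and summarizing task runtimes. The record layout, the field names (job, host, start_ms, finish_ms), the "/rack/node" host format, and the function names infer_topology and runtime_quartiles are all assumptions made for illustration; the real job tracker log format is different and is not reproduced here.

```python
from collections import defaultdict
from statistics import quantiles

# Hypothetical, already-parsed task records. The field names and the
# "/rack/node" host layout are illustrative assumptions only.
TASKS = [
    {"job": "job_1", "host": "/rack-a/node1", "start_ms": 0, "finish_ms": 1000},
    {"job": "job_1", "host": "/rack-a/node2", "start_ms": 0, "finish_ms": 2000},
    {"job": "job_2", "host": "/rack-b/node3", "start_ms": 500, "finish_ms": 3500},
    {"job": "job_2", "host": "/rack-a/node1", "start_ms": 1000, "finish_ms": 5000},
]


def infer_topology(tasks):
    """Group the hosts seen in the records under the rack they report."""
    racks = defaultdict(set)
    for task in tasks:
        # "/rack-a/node1" splits into ["", "rack-a", "node1"].
        _, rack, node = task["host"].split("/")
        racks[rack].add(node)
    return {rack: sorted(nodes) for rack, nodes in racks.items()}


def runtime_quartiles(tasks):
    """Return the three quartile cut points of task runtimes, in ms."""
    runtimes = [task["finish_ms"] - task["start_ms"] for task in tasks]
    return quantiles(runtimes, n=4)


if __name__ == "__main__":
    print(infer_topology(TASKS))
    print(runtime_quartiles(TASKS))
```

The topology here is only as complete as the set of hosts that actually ran tasks during the logged period; a host that never appears in the log cannot be inferred.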

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
