hadoop-mapreduce-issues mailing list archives

From "Dick King (JIRA)" <j...@apache.org>
Subject [jira] Updated: (MAPREDUCE-751) Rumen: a tool to extract job characterization data from job tracker logs
Date Wed, 26 Aug 2009 22:40:59 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dick King updated MAPREDUCE-751:

    Attachment: 2009-08-26--1513-patch.patch

This is a new patch for rumen.  It replaces the previous one, incorporating the comments raised
by test-patch.

Here is the new test-patch output summary:


     [exec] -1 overall.  
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]     +1 tests included.  The patch appears to include 38 new or modified tests.
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]     -1 javac.  The applied patch generated 2226 javac compiler warnings (more than the trunk's current 2220 warnings).
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec]     -1 release audit.  The applied patch generated 215 release audit warnings (more than the trunk's current 202 warnings).

The javac warnings are deprecation warnings.  We are using JobConf in this version of rumen.
We expect to fix this in a future release by moving to the new interface.

The release audit warnings flag files that lack the Apache License header.  These are .json input
files used in the test cases.  JSON does not define a comment format.  Although some JSON
engines accept comments, relying on one would tie the test inputs to that engine for little gain.
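
To illustrate why a license header cannot simply be prepended to the .json test inputs, here is
a minimal sketch using Python's standard json module as a stand-in for a strict JSON parser.
This is not the parser rumen uses; the file contents and the "jobID" field are made up for
the example.

```python
import json

# Hypothetical test-input content; the "jobID" field is illustrative only.
plain = '{"jobID": "job_0001"}'

# The same document with a license line added as a //-style comment.
with_comment = "// Licensed to the Apache Software Foundation (ASF)\n" + plain


def parses(text):
    """Return True if the strict standard-library JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(parses(plain))         # bare JSON document
print(parses(with_comment))  # same document with a comment header
```

A strict parser accepts the first input and rejects the second, so a commented header would only
work with engines that add comment support as an extension.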

I fixed the TestZombieJob code; those were the tests of the new code that failed.  The other
failed tests were in streaming, a known source of test failures.

> Rumen: a tool to extract job characterization data from job tracker logs
> ------------------------------------------------------------------------
>                 Key: MAPREDUCE-751
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-751
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.1, 0.21.0
>            Reporter: Dick King
>            Assignee: Dick King
>             Fix For: 0.20.1, 0.21.0
>         Attachments: 2009-08-19--1030.patch, 2009-08-26--1513-patch.patch, mapreduce-751--2009-07-23.patch
>  We propose a new map/reduce component, rumen, which can be used to process job history
> logs to produce any or all of the following:
>       * Retrospective info describing the statistical behavior of the
> amount of time it would have taken to launch a job into a certain
> percentage of the number of mapper slots in the log's cluster, given the
> load over the period covered by the log
>       * Statistical info as to the runtimes and shuffle times, etc. of
> the tasks and jobs covered by the log
>       * files describing detailed job trace information, and the
> network topology as inferred from the host locations and rack IDs that
> arise in the job tracker log.  In addition to this facility, rumen
> includes readers for this information to return job and detailed task
> information to other tools.
>         These other tools include a more advanced version of gridmix, as well as mumak;
> see the blocked issues.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
