hadoop-mapreduce-issues mailing list archives

From "Dick King (JIRA)" <j...@apache.org>
Subject [jira] Updated: (MAPREDUCE-1309) I want to change the rumen job trace generator to use a more modular internal structure, to allow for more input log formats
Date Tue, 22 Dec 2009 00:07:18 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-1309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dick King updated MAPREDUCE-1309:
---------------------------------

    Attachment: demuxer-plus-concatenated-files--2009-12-21.patch

This patch implements a universal gridmix3/mumak trace generator.

It differs from previous versions of rumen in three ways:

1: The main class is o.a.h.tools.rumen.Driver

2: This tool is specialized to make traces.  Future statistics engines will be trace-based

3: The argument list is more austere.  There are three or more arguments:

3a: the trace output, a {{Path}}, compressed or not

3b: the topology output, again a {{Path}}, again compressed or not

3c: any number of {{Path}} names, each of which can be compressed or not, and each of which can be a config.xml file, a job tracker log [ {{Driver}} determines the version ], or a directory filled with such files.
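
The argument layout above can be sketched roughly as follows. This is only an illustration of the positional convention (trace output, topology output, then one or more inputs), not the code in the attached patch: the class {{TraceArgs}}, its fields, and the sample file names are all hypothetical, and plain strings stand in for Hadoop {{Path}} objects so the sketch has no dependencies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical holder for the argument list described in 3a-3c.
// This is NOT the actual rumen Driver; it only mirrors the positional layout.
final class TraceArgs {
    final String traceOutput;     // 3a: where the job trace is written
    final String topologyOutput;  // 3b: where the topology is written
    final List<String> inputs;    // 3c: config.xml files, job tracker logs, or directories

    private TraceArgs(String traceOutput, String topologyOutput, List<String> inputs) {
        this.traceOutput = traceOutput;
        this.topologyOutput = topologyOutput;
        this.inputs = inputs;
    }

    // Splits the raw arguments into the two outputs and the remaining inputs.
    static TraceArgs parse(String[] args) {
        if (args.length < 3) {
            throw new IllegalArgumentException(
                "usage: <trace-output> <topology-output> <input> [<input> ...]");
        }
        List<String> inputs =
            new ArrayList<>(Arrays.asList(args).subList(2, args.length));
        return new TraceArgs(args[0], args[1], inputs);
    }

    public static void main(String[] ignored) {
        // Sample file names are made up for illustration only.
        String[] sample = {"job-trace.json.gz", "topology.json", "history-logs/", "job_conf.xml"};
        TraceArgs parsed = parse(sample);
        System.out.println("trace output:    " + parsed.traceOutput);
        System.out.println("topology output: " + parsed.topologyOutput);
        System.out.println("inputs:          " + parsed.inputs);
    }
}
```

Requiring at least three arguments reflects the "three or more" wording: two outputs plus at least one input.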



> I want to change the rumen job trace generator to use a more modular internal structure, to allow for more input log formats
> -----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1309
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1309
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Dick King
>            Assignee: Dick King
>         Attachments: demuxer-plus-concatenated-files--2009-12-21.patch
>
>
> There are two orthogonal questions to answer when processing a job tracker log: how are the logs and the xml configuration files packaged, and in which release of Hadoop Map/Reduce were the logs generated?  The existing rumen handles only a couple of answers to these questions.  The new engine will handle three answers to the version question: 0.18, 0.20 and current, and two answers to the packaging question: separate files with names derived from the job ID, and concatenated files with a header between sections [used for easier file interchange].
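
To make the "orthogonal" point concrete, here is a minimal sketch that models the two questions as independent enums. All names ({{InputKinds}}, {{LogVersion}}, {{Packaging}}, {{InputKind}}) are hypothetical and not taken from the patch, and no detection or parsing logic is shown. The sketch only illustrates that three version answers and two packaging answers combine freely into six input kinds.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the two independent questions; not code from the patch.
final class InputKinds {
    // Which release produced the job tracker log.
    enum LogVersion { V0_18, V0_20, CURRENT }

    // How the logs and xml configuration files are packaged.
    enum Packaging { SEPARATE_FILES, CONCATENATED_FILES }

    // One answer to each question; the two fields vary independently.
    static final class InputKind {
        final LogVersion version;
        final Packaging packaging;

        InputKind(LogVersion version, Packaging packaging) {
            this.version = version;
            this.packaging = packaging;
        }

        @Override
        public String toString() {
            return version + " / " + packaging;
        }
    }

    // Enumerates every (version, packaging) pair.
    static List<InputKind> allCombinations() {
        List<InputKind> kinds = new ArrayList<>();
        for (LogVersion v : LogVersion.values()) {
            for (Packaging p : Packaging.values()) {
                kinds.add(new InputKind(v, p));
            }
        }
        return kinds;
    }

    public static void main(String[] args) {
        for (InputKind kind : allCombinations()) {
            System.out.println(kind);
        }
    }
}
```

Under this assumption, a modular design needs one handler per version and one per packaging (five pieces), rather than one handler per combination (six), and the gap widens as more formats are added.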

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

