hadoop-mapreduce-issues mailing list archives

From "Jothi Padmanabhan (JIRA)" <j...@apache.org>
Subject [jira] Updated: (MAPREDUCE-157) Job History log file format is not friendly for external tools.
Date Fri, 04 Sep 2009 11:13:57 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Jothi Padmanabhan updated MAPREDUCE-157:

    Attachment: mapred-157-4Sep.patch

Patch for changing History to use JSON format.  Some notes about the patch:

All history information is logged using events.
A version event is prepended to all history files.
History Viewer and History Parser have been cleaned up, and code duplicated between the jsp
files and HistoryViewer has been removed.
History files are named JobID_username. Filters on the UI page will now be based only on JobID
and user name.
History Viewer now takes a history file as an argument instead of an output directory.
All events are made up of new API objects, including counters. As a result, I had to make
a couple of constructors in Counters public.
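
As a rough illustration of the notes above (not the code in the patch), the sketch below writes one JSON event per line with a version event first, then reads the file back. The event type names, the field names, the version string, and the helper functions are all hypothetical. The filename pattern assumes job IDs of the form job_<digits>_<digits>, which is also an assumption made here for illustration.

```python
import io
import json
import re

HISTORY_VERSION = "1.0"  # hypothetical version string, not taken from the patch


def write_history(stream, events):
    """Write a version event first, then one JSON event per line."""
    version_event = {"type": "VERSION", "version": HISTORY_VERSION}
    stream.write(json.dumps(version_event) + "\n")
    for event in events:
        stream.write(json.dumps(event) + "\n")


def read_history(stream):
    """Return (version, events); reject files without a leading version event."""
    lines = [line for line in stream if line.strip()]
    if not lines:
        raise ValueError("empty history file")
    first = json.loads(lines[0])
    if first.get("type") != "VERSION":
        raise ValueError("history file does not start with a version event")
    return first["version"], [json.loads(line) for line in lines[1:]]


def split_history_filename(name):
    """Split 'JobID_username' into its parts (assumed job ID shape, see above)."""
    match = re.match(r"^(job_\d+_\d+)_(.+)$", name)
    if match is None:
        raise ValueError("unexpected history file name: " + name)
    return match.group(1), match.group(2)


# Usage: round-trip two made-up events through an in-memory file.
buf = io.StringIO()
write_history(buf, [
    {"type": "JOB_SUBMITTED", "jobid": "job_200909041113_0001", "user": "jothi"},
    {"type": "JOB_FINISHED", "jobid": "job_200909041113_0001",
     "counters": {"MAP_INPUT_RECORDS": 42}},
])
buf.seek(0)
version, events = read_history(buf)
job_id, user = split_history_filename("job_200909041113_0001_jothi")
```

Putting the version event first lets a reader decide how to interpret the remaining lines before parsing any of them.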

Hadoop-Vaidya has been changed to use the new History Viewer, but has not been tested with
A temporary fix has been put in for Rumen to get it to compile; it still works only with the old
history format and not the new one.

> Job History log file format is not friendly for external tools.
> ---------------------------------------------------------------
>                 Key: MAPREDUCE-157
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-157
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>    Affects Versions: 0.20.1
>            Reporter: Owen O'Malley
>            Assignee: Jothi Padmanabhan
>             Fix For: 0.21.0
>         Attachments: mapred-157-4Sep.patch, mapred-157-prelim.patch, MAPREDUCE-157-avro.patch
> Currently, parsing the job history logs with external tools is very difficult because
> of the format. The most critical problem is that newlines aren't escaped in the strings. That
> makes using tools like grep, sed, and awk very tricky.
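
To illustrate why an encoding that escapes newlines helps line-oriented tools, here is a minimal sketch using Python's standard json module. The event and field names are made up for the example and do not come from the history format itself.

```python
import json

# A string value containing a raw newline, such as an error with a stack trace.
message = "Task failed:\njava.io.IOException: disk full"

# json.dumps escapes the newline, so the whole record stays on one physical line.
record = json.dumps({"type": "TASK_FAILED", "error": message})

# Build a small log with one record per line.
log = record + "\n" + json.dumps({"type": "JOB_FINISHED"}) + "\n"

# A grep-like scan now matches whole records instead of fragments of a value.
matches = [line for line in log.splitlines() if "TASK_FAILED" in line]

# Decoding the matched line restores the original string, newline included.
decoded = json.loads(matches[0])["error"]
```

Because each record occupies exactly one line, a line-based filter such as grep returns complete records.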

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
