hadoop-mapreduce-issues mailing list archives

From "eric baldeschwieler (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-157) Job History log file format is not friendly for external tools.
Date Tue, 15 Sep 2009 17:14:00 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755585#action_12755585 ]

eric baldeschwieler commented on MAPREDUCE-157:
-----------------------------------------------

Doug, so your patch will store binary AVRO?  If we can convert, we should probably convert
all the way to native AVRO.  That will be more tested / common than text AVRO and presumably
require less storage.  Presumably AVRO will have a full toolset for dumping / exploring files
like this, so the binary format should not be a problem?

> Job History log file format is not friendly for external tools.
> ---------------------------------------------------------------
>
>                 Key: MAPREDUCE-157
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-157
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>    Affects Versions: 0.20.1
>            Reporter: Owen O'Malley
>            Assignee: Jothi Padmanabhan
>             Fix For: 0.21.0
>
>         Attachments: mapred-157-10Sep.patch, mapred-157-15Sep-v1.patch, mapred-157-15Sep.patch,
> mapred-157-4Sep.patch, mapred-157-7Sep-v1.patch, mapred-157-7Sep.patch, mapred-157-prelim.patch,
> MAPREDUCE-157-avro.patch, MAPREDUCE-157-avro.patch
>
>
> Currently, parsing the job history logs with external tools is very difficult because
> of the format. The most critical problem is that newlines aren't escaped in the strings. That
> makes using tools like grep, sed, and awk very tricky.
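
To illustrate the newline problem described in the issue, here is a minimal Python sketch. The NAME="..." record layout and the escape/unescape helpers are assumptions made up for this example; they are not the actual job history format or the approach taken in any of the attached patches. The sketch only shows why an unescaped newline splits one record across several lines, which is what trips up line-oriented tools such as grep, sed, and awk.

```python
def escape(value: str) -> str:
    # Escape backslashes first so the mapping stays reversible,
    # then turn real newlines into the two characters "\" and "n".
    return value.replace("\\", "\\\\").replace("\n", "\\n")


def unescape(value: str) -> str:
    # Reverse of escape(): walk the string and decode backslash pairs.
    out = []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            nxt = value[i + 1]
            out.append("\n" if nxt == "n" else nxt)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Hypothetical field values; the second one contains an embedded newline.
values = ["job one", "line a\nline b"]

# One record per value, written without and with escaping.
raw_log = "\n".join(f'NAME="{v}"' for v in values)
escaped_log = "\n".join(f'NAME="{escape(v)}"' for v in values)

# A line-oriented tool sees the log as a list of lines.
raw_lines = raw_log.split("\n")
escaped_lines = escaped_log.split("\n")

print(len(values), "records,", len(raw_lines), "raw lines,",
      len(escaped_lines), "escaped lines")
```

With escaping, each record occupies exactly one line, so a per-line match corresponds to a whole record, and the original value can still be recovered with unescape().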

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

