hadoop-mapreduce-issues mailing list archives

From "eric baldeschwieler (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-157) Job History log file format is not friendly for external tools.
Date Fri, 14 Aug 2009 22:43:14 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743476#action_12743476 ]

eric baldeschwieler commented on MAPREDUCE-157:

I think using Avro is interesting down the road.  It seems too close to the release, and too early
in Avro's life, to do this now.

Can we move forward with this as planned and then file another bug for the Avro conversion?
I think that will take some more discussion.

A union schema seems kind of heavy.  What we would want here is a schema for each event type,
so a parser could emit a sequence of objects.
Is a sequence of Avro records of mixed types something that is easy to express in Avro?  One
could clearly do it by having a sequence of "<type> : avro record\n" lines.
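
A minimal sketch of that line-per-event layout, under stated assumptions: JSON stands in
for the Avro record encoding (this is not the Avro library), and the function names and
event type names are hypothetical, chosen only for illustration. The point it shows is that
each event carries its own type tag and that embedded newlines are escaped, so every
event occupies exactly one physical line.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Tuple

Event = Tuple[str, Dict[str, Any]]

SEPARATOR = " : "  # assumed delimiter between the type tag and the record


def write_events(events: Iterable[Event]) -> str:
    """Serialize (event_type, record) pairs as one '<type> : <record>' line each."""
    lines = []
    for event_type, record in events:
        # json.dumps escapes control characters, so a newline inside a string
        # value becomes the two characters '\' and 'n' and cannot split the line.
        lines.append(f"{event_type}{SEPARATOR}{json.dumps(record)}\n")
    return "".join(lines)


def read_events(text: str) -> Iterator[Event]:
    """Parse the format back into a sequence of (event_type, record) pairs."""
    for line in text.split("\n"):
        if not line:
            continue  # skip the empty string after the final newline
        # Split on the first separator only; the record itself may contain " : ".
        event_type, _, payload = line.partition(SEPARATOR)
        yield event_type, json.loads(payload)
```

With this layout a parser can dispatch on the type tag and decode each record against the
schema for that event type, and line-oriented tools such as grep still see one event per line.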

> Job History log file format is not friendly for external tools.
> ---------------------------------------------------------------
>                 Key: MAPREDUCE-157
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-157
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>            Reporter: Owen O'Malley
>            Assignee: Jothi Padmanabhan
> Currently, parsing the job history logs with external tools is very difficult because
> of the format. The most critical problem is that newlines aren't escaped in the strings. That
> makes using tools like grep, sed, and awk very tricky.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
