hadoop-yarn-issues mailing list archives

From "Robert Kanter (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-6875) New aggregated log file format for YARN log aggregation.
Date Thu, 27 Jul 2017 22:00:02 GMT

    [ https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16103982#comment-16103982 ]

Robert Kanter commented on YARN-6875:

[~xgong], have you taken a look at YARN-2942 and subtasks?  I tried to do something like this
a while ago and we went through a few different designs (I think there are 3 major different
approaches, and some minor revisions for each); one of the approaches was very similar to
your design, where there's an index file.

In the end, we decided to do something completely different (MAPREDUCE-6415) by adding a command
to combine log files into HAR files.  This was to help with the too-many-small-files problem,
though we still kept the T-files, so the goal was slightly different.

Anyway, I did write a bunch of code for YARN-2942 and some subtasks before we canned it, so
you might want to take a look in case you find something useful in there or the design documents.

> New aggregated log file format for YARN log aggregation.
> --------------------------------------------------------
>                 Key: YARN-6875
>                 URL: https://issues.apache.org/jira/browse/YARN-6875
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>         Attachments: YARN-6875-NewLogAggregationFormat-design-doc.pdf
> T-file is the underlying log format for the aggregated logs in YARN. We have seen several
> performance issues, especially for very large log files.
> We will introduce a new log format which has better performance for large log files.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org
