hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-6875) New aggregated log file format for YARN log aggregation.
Date Wed, 02 Aug 2017 22:56:02 GMT

    [ https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111886#comment-16111886 ]

Wangda Tan commented on YARN-6875:

[~jlowe], [~xgong],
I've been thinking about this issue. We could probably keep the index in a local file instead
of a remote one, to avoid extra load on the NN.

Do you think the following solution is reasonable?
- The local log aggregator always maintains a separate, confirmed index file in the *local dir*.
- When we need to do partial log aggregation, we always read the local index file, and replace
it once the partial log aggregation finishes.
- For a file that is still being appended to, we try to load the local index file. (I think
this is possible.)
- If appending fails and the NM will retry, we follow the same logic as above.
- If appending fails and the NM is alive but will not retry, it appends the index to the
remote file.
- If appending fails and the NM is not alive, we follow Jason's logic and scan for the last
index. This should be rare.
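
To make the cases above concrete, here is a rough sketch of the decision logic and of replacing the local index file. It is only an illustration, not the proposed NM code: the names (IndexAction, choose_index_action, replace_local_index, load_local_index) and the JSON index layout are all hypothetical, and the real aggregator is Java, not Python.

```python
import json
import os
import tempfile
from enum import Enum, auto


class IndexAction(Enum):
    """Hypothetical labels for the cases listed above."""
    READ_LOCAL_INDEX = auto()        # use the confirmed index in the NM local dir
    APPEND_INDEX_TO_REMOTE = auto()  # NM alive, no retry: write index to the remote file
    SCAN_REMOTE_FOR_INDEX = auto()   # NM gone: scan the remote file for the last index


def choose_index_action(append_failed: bool, nm_alive: bool,
                        nm_will_retry: bool) -> IndexAction:
    """Map the state of an append attempt to one of the cases above."""
    if not append_failed:
        return IndexAction.READ_LOCAL_INDEX
    if not nm_alive:
        # Rare case: nobody is left to supply the local index.
        return IndexAction.SCAN_REMOTE_FOR_INDEX
    if nm_will_retry:
        # Same logic as the normal path: the retry reads the local index.
        return IndexAction.READ_LOCAL_INDEX
    return IndexAction.APPEND_INDEX_TO_REMOTE


def replace_local_index(index_path: str, entries: list) -> None:
    """Replace the local index only after the new contents are fully written.

    Assumption: the index is a small JSON list; the real format is not
    specified in this thread.
    """
    directory = os.path.dirname(os.path.abspath(index_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".index-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(entries, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Readers see either the old index or the new one, never a partial file.
        os.replace(tmp_path, index_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_local_index(index_path: str) -> list:
    """Read the confirmed local index back."""
    with open(index_path) as f:
        return json.load(f)
```

The write-to-temp-then-rename step is one way to keep the local index "confirmed": if the NM dies in the middle of a partial aggregation, the previous index is still intact on the local dir.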

Hope to hear your thoughts.

> New aggregated log file format for YARN log aggregation.
> --------------------------------------------------------
>                 Key: YARN-6875
>                 URL: https://issues.apache.org/jira/browse/YARN-6875
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>         Attachments: YARN-6875-NewLogAggregationFormat-design-doc.pdf
> T-file is the underlying log format for the aggregated logs in YARN. We have seen several performance issues, especially for very large log files.
> We will introduce a new log format that has better performance for large log files.
