hadoop-yarn-issues mailing list archives

From "Zhijie Shen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2583) Modify the LogDeletionService to support Log aggregation for LRS
Date Wed, 01 Oct 2014 22:09:34 GMT

    [ https://issues.apache.org/jira/browse/YARN-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14155643#comment-14155643 ]

Zhijie Shen commented on YARN-2583:

Per discussion offline:

1. In AggregatedLogDeletionService of JHS, we delete the log files of completed apps, and in
AppLogAggregatorImpl of NM, we delete the log files of running LRS apps. We need to add a test
case to verify that AggregatedLogDeletionService won't delete the logs of a running LRS.

2. We apply the same retention policy on both sides, using time to determine which log
files need to be deleted.

3. For scalability, let's keep the limit on the number of logs per app, in case the
rolling interval is small and too many log files are generated. But let's keep the
config private to AppLogAggregatorImpl.
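
To make points 2 and 3 concrete, here is a minimal sketch of a retention check that
combines a time cut-off with a per-app cap on the number of log files. It is not the
code in the patch: LogRetentionSketch, LogFile, selectForDeletion and all parameter
names are hypothetical, and plain Java collections stand in for the HDFS file listing.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class LogRetentionSketch {

    /** Hypothetical stand-in for one aggregated log file of an app. */
    public static final class LogFile {
        final String name;
        final long modifiedMillis;

        public LogFile(String name, long modifiedMillis) {
            this.name = name;
            this.modifiedMillis = modifiedMillis;
        }
    }

    /**
     * Returns the names of the files to delete, newest first.
     * A file is selected if it is older than the cut-off (now - retention)
     * or if it falls outside the newest maxFilesPerApp files.
     */
    public static List<String> selectForDeletion(List<LogFile> files,
                                                 long nowMillis,
                                                 long retentionMillis,
                                                 int maxFilesPerApp) {
        long cutoff = nowMillis - retentionMillis;

        // Sort a copy newest-first so the count limit keeps the most recent files.
        List<LogFile> sorted = new ArrayList<>(files);
        sorted.sort(Comparator.comparingLong((LogFile f) -> f.modifiedMillis).reversed());

        List<String> toDelete = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i++) {
            LogFile f = sorted.get(i);
            boolean tooOld = f.modifiedMillis < cutoff;
            boolean overLimit = i >= maxFilesPerApp;
            if (tooOld || overLimit) {
                toDelete.add(f.name);
            }
        }
        return toDelete;
    }
}
```

In this sketch the time cut-off is the shared policy, while maxFilesPerApp plays the
role of the private safeguard against a very small rolling interval.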

> Modify the LogDeletionService to support Log aggregation for LRS
> ----------------------------------------------------------------
>                 Key: YARN-2583
>                 URL: https://issues.apache.org/jira/browse/YARN-2583
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>         Attachments: YARN-2583.1.patch
> Currently, AggregatedLogDeletionService deletes old logs from HDFS. It checks the
> cut-off time: if all logs for an application are older than the cut-off time, the
> app-log-dir is deleted from HDFS. This will not work for LRS, because we expect an LRS
> application to keep running for a long time.
> Two different scenarios:
> 1) If rollingIntervalSeconds is configured, new log files will always be uploaded
> to HDFS. The number of log files for the application keeps growing, and no log
> files are ever deleted.
> 2) If rollingIntervalSeconds is not configured, the log files can only be uploaded
> to HDFS after the application has finished. It is quite possible that the logs are
> uploaded after the cut-off time. This causes a problem because by then the
> app-log-dir for the application in HDFS has already been deleted.
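
The directory-level check described above, extended with a guard for running
applications, could look roughly like the following. This is only an illustrative
sketch: AppDirDeletionSketch and shouldDeleteAppLogDir are hypothetical names, and the
running state and file timestamps are passed in directly rather than fetched from the
ResourceManager or HDFS.

```java
import java.util.List;

public class AppDirDeletionSketch {

    /**
     * Decides whether the whole app-log-dir may be removed.
     * The directory is kept while the app is still running (for example an LRS),
     * and also if any log file is at or after the cut-off time.
     */
    public static boolean shouldDeleteAppLogDir(boolean appRunning,
                                                List<Long> logModifiedMillis,
                                                long cutoffMillis) {
        if (appRunning) {
            // Logs may still arrive later, so never drop the directory here.
            return false;
        }
        for (long modified : logModifiedMillis) {
            if (modified >= cutoffMillis) {
                return false; // at least one log is still within retention
            }
        }
        return true; // app finished and every log is older than the cut-off
    }
}
```

With a guard like this, a running application whose logs are all older than the
cut-off time still keeps its directory, which is the case both scenarios depend on.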

This message was sent by Atlassian JIRA
