hadoop-yarn-issues mailing list archives

From "Zhijie Shen (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-2583) Modify the LogDeletionService to support Log aggregation for LRS
Date Tue, 07 Oct 2014 07:26:35 GMT

    [ https://issues.apache.org/jira/browse/YARN-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161547#comment-14161547 ]

Zhijie Shen edited comment on YARN-2583 at 10/7/14 7:26 AM:
------------------------------------------------------------

[~xgong], thanks for the patch. I'm fine with the approach. Here are some comments:

1. Be more specific: "we find a more scalable method to write only a single log file per LRS"?
{code}
  // we find a more scalable method.
{code}

2. Make 30 and 3600 constants of AppLogAggregatorImpl?
{code}
    int configuredRentionSize =
        conf.getInt(NM_LOG_AGGREGATION_RETAIN_RETENTION_SIZE_PER_APP, 30);
{code}
{code}
    if (configuredInterval > 0 && configuredInterval < 3600) {
{code}

3. Should this be ">"?
{code}
      if (status.size() >= this.retentionSize) {
{code}
And should this be "<"?
{code}
        for (int i = 0 ; i <= statusList.size() - this.retentionSize; i++) {
{code}

4. Why not use YarnClient? Is it a packaging issue?
{code}
  private ApplicationClientProtocol rmClient;
{code}

5. The existing annotation is not correct. At the very least, the module is already reused by MR.
{code}
@Private
public class AggregatedLogDeletionService extends AbstractService {
{code}

6. Instead of making the log deletion task non-static, why not pass rmClient into the constructor
of this class?

7. Shall we spin off the LogRollingInterval-related change into a separate JIRA?
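
To make comments 2 and 3 concrete, here is a minimal, self-contained sketch of the two boundary choices. It does not use any Hadoop classes; the class name, method names, and constant names are hypothetical, and the list of strings merely stands in for the status list in the patch. It assumes retentionSize >= 1 and a list ordered oldest first. With ">" and "<", exactly retentionSize entries survive; with the ">=" and "<=" bounds quoted from the patch, one extra entry is deleted, which may be intentional if the goal is to make room for the file about to be uploaded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RetentionSketch {
    // Hypothetical constant names for the literals 30 and 3600 used inline in the patch.
    static final int DEFAULT_RETENTION_SIZE_PER_APP = 30;
    static final int ROLLING_INTERVAL_THRESHOLD_SECONDS = 3600;

    /** Bounds suggested in the review (">" and "<"): exactly retentionSize entries remain. */
    static <T> List<T> toDeleteKeepingExactly(List<T> oldestFirst, int retentionSize) {
        List<T> doomed = new ArrayList<>();
        if (oldestFirst.size() > retentionSize) {
            for (int i = 0; i < oldestFirst.size() - retentionSize; i++) {
                doomed.add(oldestFirst.get(i));
            }
        }
        return doomed;
    }

    /** Bounds quoted from the patch (">=" and "<="): retentionSize - 1 entries remain. */
    static <T> List<T> toDeleteAsInPatch(List<T> oldestFirst, int retentionSize) {
        List<T> doomed = new ArrayList<>();
        if (oldestFirst.size() >= retentionSize) {
            // Safe only because retentionSize >= 1 keeps i within the list.
            for (int i = 0; i <= oldestFirst.size() - retentionSize; i++) {
                doomed.add(oldestFirst.get(i));
            }
        }
        return doomed;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("log1", "log2", "log3", "log4", "log5");
        System.out.println("keep exactly 3, delete: " + toDeleteKeepingExactly(files, 3));
        System.out.println("patch bounds,   delete: " + toDeleteAsInPatch(files, 3));
    }
}
```

Which variant is right depends on whether the check runs before or after the new log file is counted, so the sketch only shows the difference and does not settle the question.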


was (Author: zjshen):
1. Be more specific: "we find a more scalable method to write only a single log file per LRS"?
{code}
  // we find a more scalable method.
{code}

2. Make 30 and 3600 constants of AppLogAggregatorImpl?
{code}
    int configuredRentionSize =
        conf.getInt(NM_LOG_AGGREGATION_RETAIN_RETENTION_SIZE_PER_APP, 30);
{code}
{code}
    if (configuredInterval > 0 && configuredInterval < 3600) {
{code}

3. Should this be ">"?
{code}
      if (status.size() >= this.retentionSize) {
{code}
And should this be "<"?
{code}
        for (int i = 0 ; i <= statusList.size() - this.retentionSize; i++) {
{code}

4. Why not use YarnClient? Is it a packaging issue?
{code}
  private ApplicationClientProtocol rmClient;
{code}

> Modify the LogDeletionService to support Log aggregation for LRS
> ----------------------------------------------------------------
>
>                 Key: YARN-2583
>                 URL: https://issues.apache.org/jira/browse/YARN-2583
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>         Attachments: YARN-2583.1.patch, YARN-2583.2.patch, YARN-2583.3.1.patch, YARN-2583.3.patch
>
>
> Currently, AggregatedLogDeletionService deletes old logs from HDFS. It checks the
> cut-off time: if all logs for an application are older than the cut-off time, the
> app-log-dir is deleted from HDFS. This does not work for LRS, since we expect an LRS
> application to keep running for a long time.
> Two different scenarios:
> 1) If rollingIntervalSeconds is configured, new log files are continually uploaded
> to HDFS. The number of log files for the application keeps growing, and no log files
> are ever deleted.
> 2) If rollingIntervalSeconds is not configured, the log files can only be uploaded
> to HDFS after the application finishes. It is quite possible that the logs are uploaded
> after the cut-off time, which causes problems because by then the app-log-dir for the
> application in HDFS has already been deleted.
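
The cut-off rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not the code of AggregatedLogDeletionService: the class and method names are hypothetical, and plain modification times in milliseconds stand in for the HDFS file statuses.

```java
import java.util.Arrays;
import java.util.List;

public class CutoffSketch {
    /**
     * Returns true when every log file was last modified before the cut-off,
     * i.e. the whole app-log-dir would be deleted under the rule described above.
     * An empty list also yields true in this sketch.
     */
    static boolean allLogsOlderThanCutoff(List<Long> logModTimesMillis,
                                          long nowMillis, long retainMillis) {
        long cutoff = nowMillis - retainMillis;
        for (long modTime : logModTimesMillis) {
            if (modTime >= cutoff) {
                return false; // at least one log is still within the retention window
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long now = 1_000L;
        long retain = 500L; // cut-off is at t = 500
        // Finished application: all logs are old, so the directory qualifies for deletion.
        System.out.println(allLogsOlderThanCutoff(Arrays.asList(100L, 200L), now, retain));
        // Rolling uploads (scenario 1): a fresh file keeps the whole directory alive.
        System.out.println(allLogsOlderThanCutoff(Arrays.asList(100L, 200L, 900L), now, retain));
    }
}
```

Under this all-or-nothing rule, scenario 1 never deletes anything because a rolling application always has at least one recent file, which is why a per-application retention limit is being discussed.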



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
