hadoop-yarn-issues mailing list archives

From "Craig Condit (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4946) RM should not consider an application as COMPLETED when log aggregation is not in a terminal state
Date Mon, 20 May 2019 23:40:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844394#comment-16844394 ]

Craig Condit commented on YARN-4946:
------------------------------------

Opened YARN-9571 to make this feature configurable. I think with this in place, backporting
should be relatively safe.

> RM should not consider an application as COMPLETED when log aggregation is not in a terminal state
> --------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4946
>                 URL: https://issues.apache.org/jira/browse/YARN-4946
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: log-aggregation
>    Affects Versions: 2.8.0
>            Reporter: Robert Kanter
>            Assignee: Szilard Nemeth
>            Priority: Major
>             Fix For: 3.2.0
>
>         Attachments: YARN-4946.001.patch, YARN-4946.002.patch, YARN-4946.003.patch, YARN-4946.004.patch
>
>
> MAPREDUCE-6415 added a tool that combines the aggregated log files for each Yarn App
> into a HAR file.  When run, it seeds the list by looking at the aggregated logs directory,
> and then filters out ineligible apps.  One of the criteria involves checking with the RM that
> an Application's log aggregation status is not still running and has not failed.  When the
> RM "forgets" about an older completed Application (e.g. RM failover, enough time has passed,
> etc), the tool won't find the Application in the RM and will just assume that its log aggregation
> succeeded, even if it actually failed or is still running.
> We can solve this problem by doing the following:
> The RM should not consider an app to be fully completed (and thus removed from its history)
> until the aggregation status has reached a terminal state (e.g. SUCCEEDED, FAILED, TIME_OUT).
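
The rule proposed in the description above can be sketched roughly as follows. This is only an
illustration, not the code from the attached patches: the class LogAggregationGate, its Status
enum, and the method canRemoveFromHistory are hypothetical names made up for this sketch, and the
enum lists only the states mentioned in the issue text rather than mirroring the real YARN enum.

```java
import java.util.EnumSet;
import java.util.Set;

public class LogAggregationGate {

    /** Hypothetical stand-in for the log aggregation states named in the issue. */
    public enum Status { RUNNING, SUCCEEDED, FAILED, TIME_OUT }

    /** States after which the aggregation status can no longer change. */
    private static final Set<Status> TERMINAL =
            EnumSet.of(Status.SUCCEEDED, Status.FAILED, Status.TIME_OUT);

    /** Returns true once log aggregation has reached a terminal state. */
    public static boolean isTerminal(Status status) {
        return TERMINAL.contains(status);
    }

    /**
     * An app may be dropped from the RM's history only if it has finished
     * AND its log aggregation is in a terminal state. Until then the RM keeps
     * it, so a client asking for the status still gets a real answer.
     */
    public static boolean canRemoveFromHistory(boolean appFinished, Status status) {
        return appFinished && isTerminal(status);
    }

    public static void main(String[] args) {
        // Finished app whose logs are still being aggregated: must be kept.
        System.out.println(canRemoveFromHistory(true, Status.RUNNING));  // prints "false"
        // Finished app whose aggregation failed: terminal, may be removed.
        System.out.println(canRemoveFromHistory(true, Status.FAILED));   // prints "true"
    }
}
```

With a gate like this, a tool that cannot find an app in the RM could more reasonably assume the
aggregation status was already terminal when the app was removed, although a terminal state may
still be FAILED or TIME_OUT rather than SUCCEEDED.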



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

