hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Xuan Gong (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4392) ApplicationCreatedEvent event time resets after RM restart/failover
Date Wed, 02 Dec 2015 20:51:11 GMT

    [ https://issues.apache.org/jira/browse/YARN-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036580#comment-15036580
] 

Xuan Gong commented on YARN-4392:
---------------------------------

[~Naganarasimha]

bq. there is no limit on number of running apps in state store and finished apps are restricted
to a configurable number. In such cases would not there be many created events in a larger
cluster on recovery?

This is a good point given the performance of ATS v1 is not that scalable. 

Will it cause any issue if the APP_CREATED event is missing ? If that only cause the missing
related information in ATS webui/webservice, I am OK with not re-sending the ATS events on
recovery. 

[~jlowe] What is your opinion ?

> ApplicationCreatedEvent event time resets after RM restart/failover
> -------------------------------------------------------------------
>
>                 Key: YARN-4392
>                 URL: https://issues.apache.org/jira/browse/YARN-4392
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.8.0
>            Reporter: Xuan Gong
>            Assignee: Naganarasimha G R
>            Priority: Critical
>         Attachments: YARN-4392-2015-11-24.patch, YARN-4392.1.patch, YARN-4392.2.patch
>
>
> {code}2015-09-01 12:39:09,852 WARN util.Times (Times.java:elapsed(53)) - Finished time
1437453994768 is ahead of started time 1440308399674 
> 2015-09-01 12:39:09,852 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437454008244
is ahead of started time 1440308399676 
> 2015-09-01 12:39:09,852 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444305171
is ahead of started time 1440308399653 
> 2015-09-01 12:39:09,852 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444293115
is ahead of started time 1440308399647 
> 2015-09-01 12:39:09,852 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444379645
is ahead of started time 1440308399656 
> 2015-09-01 12:39:09,852 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444361234
is ahead of started time 1440308399655 
> 2015-09-01 12:39:09,852 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444342029
is ahead of started time 1440308399654 
> 2015-09-01 12:39:09,852 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444323447
is ahead of started time 1440308399654 
> 2015-09-01 12:39:09,853 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444430006
is ahead of started time 1440308399660 
> 2015-09-01 12:39:09,853 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444415698
is ahead of started time 1440308399659 
> 2015-09-01 12:39:09,853 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444419060
is ahead of started time 1440308399658 
> 2015-09-01 12:39:09,853 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444393931
is ahead of started time 1440308399657
> {code} . 
> From ATS logs, we would see a large amount of 'stale alerts' messages periodically



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message