hadoop-yarn-issues mailing list archives

From "Xuan Gong (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4392) ApplicationCreatedEvent event time resets after RM restart/failover
Date Wed, 25 Nov 2015 00:40:11 GMT

    [ https://issues.apache.org/jira/browse/YARN-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15025834#comment-15025834 ]

Xuan Gong commented on YARN-4392:
---------------------------------

I created two patches to fix this issue:
1) The patch with timestamp: when ATS generates the application create_time, it reads
ApplicationMetricsConstants.SUBMITTED_TIME_ENTITY_INFO instead of the timeline event timestamp.

2) The patch without timestamp: when creating the RMAppImpl object, we pass startTime as an
input. For a new application, startTime is set to the current timestamp. For a recovered
application, startTime is set from appState. This way, the RM web UI and the ATS UI both
show a consistent application start time.
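
A minimal sketch of the idea behind option 2, to make the new/recovered distinction concrete. This is not the code from either attached patch: StartTimeSketch, StoredAppState, App, newApp and recoveredApp are hypothetical stand-ins, and the timestamps are illustrative values only.

```java
// Hypothetical illustration of option 2; not the actual RMAppImpl code.
public final class StartTimeSketch {

    /** Stand-in for the application state persisted before an RM restart. */
    static final class StoredAppState {
        final long startTime;
        StoredAppState(long startTime) { this.startTime = startTime; }
    }

    /** Stand-in for an application object that takes startTime as a constructor input. */
    static final class App {
        final long startTime;
        App(long startTime) { this.startTime = startTime; }
    }

    /** New application: the start time is the current timestamp. */
    static App newApp(long now) {
        return new App(now);
    }

    /** Recovered application: the start time comes from the stored state, not the clock. */
    static App recoveredApp(StoredAppState state) {
        return new App(state.startTime);
    }

    public static void main(String[] args) {
        StoredAppState stored = new StoredAppState(1000L); // illustrative value
        long restartTime = 5000L;                          // illustrative value
        System.out.println(newApp(restartTime).startTime);
        System.out.println(recoveredApp(stored).startTime);
    }
}
```

Because the start time is fixed when the object is constructed, any consumer that reads it afterwards sees the same value, whether or not the RM has restarted in between.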

Personally, I prefer option 2.

[~jlowe], [~Naganarasimha], [~jeagles] what do you think?

> ApplicationCreatedEvent event time resets after RM restart/failover
> -------------------------------------------------------------------
>
>                 Key: YARN-4392
>                 URL: https://issues.apache.org/jira/browse/YARN-4392
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.8.0
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>            Priority: Critical
>         Attachments: YARN-4392-2015-11-24.patch, YARN-4392.1.patch
>
>
> {code}
> 2015-09-01 12:39:09,852 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437453994768 is ahead of started time 1440308399674
> 2015-09-01 12:39:09,852 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437454008244 is ahead of started time 1440308399676
> 2015-09-01 12:39:09,852 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444305171 is ahead of started time 1440308399653
> 2015-09-01 12:39:09,852 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444293115 is ahead of started time 1440308399647
> 2015-09-01 12:39:09,852 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444379645 is ahead of started time 1440308399656
> 2015-09-01 12:39:09,852 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444361234 is ahead of started time 1440308399655
> 2015-09-01 12:39:09,852 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444342029 is ahead of started time 1440308399654
> 2015-09-01 12:39:09,852 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444323447 is ahead of started time 1440308399654
> 2015-09-01 12:39:09,853 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444430006 is ahead of started time 1440308399660
> 2015-09-01 12:39:09,853 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444415698 is ahead of started time 1440308399659
> 2015-09-01 12:39:09,853 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444419060 is ahead of started time 1440308399658
> 2015-09-01 12:39:09,853 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444393931 is ahead of started time 1440308399657
> {code}
> From the ATS logs, we see a large number of 'stale alerts' messages periodically.
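
To make the numbers in the first warning concrete, here is a small sketch of an elapsed-time check applied to that pair of timestamps. It is an illustration only, not the source of Hadoop's Times.elapsed; ElapsedSketch and its elapsed method are hypothetical names, and returning -1 for a negative interval is an assumption of this sketch.

```java
// Hypothetical illustration; not the actual org.apache.hadoop.yarn.util.Times code.
public final class ElapsedSketch {

    /** Returns finished - started in milliseconds, or -1 if the interval is negative. */
    static long elapsed(long started, long finished) {
        long delta = finished - started;
        return delta >= 0 ? delta : -1;
    }

    public static void main(String[] args) {
        long finished = 1437453994768L; // finish time from the first warning above
        long started = 1440308399674L;  // start time from the first warning above
        System.out.println(finished - started);         // prints -2854404906
        System.out.println(elapsed(started, finished)); // prints -1
    }
}
```

The recorded start time is about 2.85 billion milliseconds (roughly 33 days) later than the finish time, which is consistent with the start time having been reset after the application finished.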



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
