Date: Wed, 2 Dec 2015 20:51:11 +0000 (UTC)
From: "Xuan Gong (JIRA)"
To: yarn-issues@hadoop.apache.org
Reply-To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-4392) ApplicationCreatedEvent event time resets after RM restart/failover

    [ https://issues.apache.org/jira/browse/YARN-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036580#comment-15036580 ]

Xuan Gong commented on YARN-4392:
---------------------------------

[~Naganarasimha]
bq. there is no limit on number of running apps in state store and finished apps are restricted to a configurable number. In such cases would not there be many created events in a larger cluster on recovery?

This is a good point, given that ATS v1 does not scale well. Will it cause any issues if the APP_CREATED event is missing? If the only consequence is that the related information is missing from the ATS web UI/web service, I am OK with not re-sending the ATS events on recovery.

[~jlowe] What is your opinion?

> ApplicationCreatedEvent event time resets after RM restart/failover
> -------------------------------------------------------------------
>
>                 Key: YARN-4392
>                 URL: https://issues.apache.org/jira/browse/YARN-4392
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.8.0
>            Reporter: Xuan Gong
>            Assignee: Naganarasimha G R
>            Priority: Critical
>         Attachments: YARN-4392-2015-11-24.patch, YARN-4392.1.patch, YARN-4392.2.patch
>
>
> {code}2015-09-01 12:39:09,852 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437453994768 is ahead of started time 1440308399674
> 2015-09-01 12:39:09,852 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437454008244 is ahead of started time 1440308399676
> 2015-09-01 12:39:09,852 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444305171 is ahead of started time 1440308399653
> 2015-09-01 12:39:09,852 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444293115 is ahead of started time 1440308399647
> 2015-09-01 12:39:09,852 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444379645 is ahead of started time 1440308399656
> 2015-09-01 12:39:09,852 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444361234 is ahead of started time 1440308399655
> 2015-09-01 12:39:09,852 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444342029 is ahead of started time 1440308399654
> 2015-09-01 12:39:09,852 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444323447 is ahead of started time 1440308399654
> 2015-09-01 12:39:09,853 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444430006 is ahead of started time 1440308399660
> 2015-09-01 12:39:09,853 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444415698 is ahead of started time 1440308399659
> 2015-09-01 12:39:09,853 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444419060 is ahead of started time 1440308399658
> 2015-09-01 12:39:09,853 WARN util.Times (Times.java:elapsed(53)) - Finished time 1437444393931 is ahead of started time 1440308399657
> {code}
> .
> From ATS logs, we would see a large amount of 'stale alerts' messages periodically

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)