Date: Mon, 29 Feb 2016 18:40:18 +0000 (UTC)
From: "Vrushali C (JIRA)"
To: yarn-issues@hadoop.apache.org
Reply-To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

    [ https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172335#comment-15172335 ]

Vrushali C commented on YARN-4700:
----------------------------------

Hi [~sjlee0]

Yes, the flow activity table's row key always needs to belong to the top of the day timestamp. But the event timestamp should be used to find out the top of that day.

bq. If they meant that we would use the actual event timestamps as is for the row key, I'm not as sure.

No, we can't use the event timestamp as is. It needs to be top of the day of that timestamp. Which is what I said in the previous comment, "the entry for that flow should go into THAT older day's row, hence we should use the event timestamp."

You are right, the code in FlowActivityRowKey#getRowKey() needs to change to take the event timestamp, not the current time. I thought we were sending in null for the timestamp and hence using current time, but looks like it's directly using current time here.

{code}
long dayTs = TimelineStorageUtils.getTopOfTheDayTimestamp(System
    .currentTimeMillis());
{code}

> ATS storage has one extra record each time the RM got restarted
> ---------------------------------------------------------------
>
>                 Key: YARN-4700
>                 URL: https://issues.apache.org/jira/browse/YARN-4700
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Li Lu
>            Assignee: Naganarasimha G R
>              Labels: yarn-2928-1st-milestone
>
> When testing the new web UI for ATS v2, I noticed that we're creating one extra record for each finished application (but still hold in the RM state store) each time the RM got restarted. It's quite possible that we add the cluster start timestamp into the default cluster id, thus each time we're creating a new record for one application (cluster id is a part of the row key). We need to fix this behavior, probably by having a better default cluster id.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
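
For readers following the thread, the change discussed in the comment above (derive the row key's day from the event timestamp, not from the current time) could look roughly like the sketch below. This is only an illustration under stated assumptions, not the actual patch: the class FlowActivityDaySketch and its methods topOfTheDay and dayTimestampForRowKey are hypothetical names, the day boundary is assumed to be midnight UTC, and the fallback to the current time for a null timestamp is an assumption based on the behavior the comment describes as expected.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration only; not the Hadoop implementation. */
final class FlowActivityDaySketch {
    private static final long MILLIS_ONE_DAY = TimeUnit.DAYS.toMillis(1);

    private FlowActivityDaySketch() {
    }

    /** Rounds an epoch-millisecond timestamp down to 00:00:00.000 UTC of its day. */
    static long topOfTheDay(long ts) {
        return ts - Math.floorMod(ts, MILLIS_ONE_DAY);
    }

    /**
     * Day timestamp to use in the row key. The event timestamp decides the day;
     * the current time is used only when no event timestamp is supplied
     * (assumed fallback, see the lead-in).
     */
    static long dayTimestampForRowKey(Long eventTs) {
        long ts = (eventTs != null) ? eventTs : System.currentTimeMillis();
        return topOfTheDay(ts);
    }

    public static void main(String[] args) {
        // An event from 2016-02-29 18:40:18 UTC belongs in that day's row,
        // no matter when the write actually happens.
        long eventTs = 1456771218000L;
        System.out.println(dayTimestampForRowKey(eventTs));
    }
}
```

With this shape, an entry for an older event would go into that older day's row, while the event timestamp itself never appears in the row key as is.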