Date: Thu, 26 Mar 2015 06:32:54 +0000 (UTC)
From: "Naganarasimha G R (JIRA)"
To: yarn-issues@hadoop.apache.org
Reply-To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS

    [ https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381453#comment-14381453 ]

Naganarasimha G R commented on YARN-3044:
-----------------------------------------

Thanks [~vinodkv], [~vrushalic], [~sjlee0] & [~zjshen] for reviewing and sharing your viewpoints:

1> {{"source of life-cycle events of container"}} is a debatable topic. To summarize the pros and cons of raising these events from the NM:

Pros
* Even though the load is not high compared to publishing container metrics, life-cycle events might still add considerable load for a large cluster
as explained by [~sjlee0]. So I feel it is better to distribute this load across the NMs.
* If the start and end times of life-cycle events are logged from the NM, it will be easier to analyze the flow of a container, since that is the actual time at which it was started.
* IMO it would be good to have all the metrics and events raised from the NM itself, as there is a possibility of a race condition if container entities are raised from the RM while metrics and a few other life-cycle events are raised from the NM, e.g. when the RM is slow to dispatch events and the NM is faster. (HBase as storage should be able to handle this well, but I am not sure about the other storages we are planning to support.)

Cons
* The start and end times of life-cycle events might not match what is displayed by the RM (web UI etc.).
* In terms of scheduling, the start and end times of life-cycle events might not be as accurate as they would be if recorded by the RM.

Please correct me on these and add anything I have missed.

2> ??But the life-cycle events of container should definitely originate at the RM; NMs don't even know many of them.??
I am not very familiar with this; can you please elaborate on what might be missed?

3> ??Why would that be the case? Can the RM timeline collector not use specific subclasses of TimelineEntity??
Well, it is not a limitation of the RM timeline collector that I am trying to point out. Rather, the writer interface is like {{TimelineWriter.write(TimelineEntities)}}, so the writer would not be aware whether the client is writing an ApplicationEntity or an AppAttemptEntity. IIUC it will just try to write the fields of the TimelineEntity to the storage. If it simply stores the entity as a JSON object directly in the storage this might not be an issue, but that will not be the case with HBase column storage, right?

4> ??My suggestion is that we start with reimplementing what we provided in YTS v1, and add more timeline data on demand later??
True, that would be sufficient to start with, but in the future I would like to capture all the events. Currently, to analyze or debug issues with a container, we usually start by searching the NM and RM logs for the container string to find what state the application/container is in. Your opinion?

> [Event producers] Implement RM writing app lifecycle events to ATS
> ------------------------------------------------------------------
>
>                 Key: YARN-3044
>                 URL: https://issues.apache.org/jira/browse/YARN-3044
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Naganarasimha G R
>         Attachments: YARN-3044.20150325-1.patch
>
>
> Per design in YARN-2928, implement RM writing app lifecycle events to ATS.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
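
To make the concern in point 3 concrete, here is a minimal Java sketch. All class names in it (SketchEntity, SketchApplicationEntity, SketchAppAttemptEntity, SketchColumnWriter) are hypothetical stand-ins and not the real YARN classes, and the type strings and row-key layout are assumptions made only for illustration. It shows a column-style writer that receives entities through the base type: it can see only the generic fields, so anything subclass-specific has to be carried in those fields (here a type string plus an info map) for it to reach storage.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for a base timeline entity.
// This is NOT the real YARN TimelineEntity class.
class SketchEntity {
    final String type; // illustrative type string, an assumption of this sketch
    final String id;
    final Map<String, Object> info = new LinkedHashMap<>();

    SketchEntity(String type, String id) {
        this.type = type;
        this.id = id;
    }
}

// Hypothetical subclass: its specific data is folded into the generic info map.
class SketchApplicationEntity extends SketchEntity {
    SketchApplicationEntity(String id, String queue) {
        super("APPLICATION", id);
        info.put("queue", queue);
    }
}

// Hypothetical subclass for an application attempt.
class SketchAppAttemptEntity extends SketchEntity {
    SketchAppAttemptEntity(String id, String masterContainer) {
        super("APP_ATTEMPT", id);
        info.put("masterContainer", masterContainer);
    }
}

// A column-style writer that only ever sees the base type.
class SketchColumnWriter {
    final List<String> cells = new ArrayList<>();

    void write(List<? extends SketchEntity> entities) {
        for (SketchEntity e : entities) {
            for (Map.Entry<String, Object> field : e.info.entrySet()) {
                // Assumed layout: row key = type + id, column = field name.
                cells.add(e.type + "!" + e.id + ":" + field.getKey() + "=" + field.getValue());
            }
        }
    }
}

class TimelineWriterSketch {
    public static void main(String[] args) {
        SketchColumnWriter writer = new SketchColumnWriter();
        List<SketchEntity> batch = new ArrayList<>();
        batch.add(new SketchApplicationEntity("app_1", "default"));
        batch.add(new SketchAppAttemptEntity("attempt_1", "container_1"));
        writer.write(batch);
        writer.cells.forEach(System.out::println);
    }
}
```

In this sketch the writer never inspects the Java subclass; it distinguishes entities only by the type string carried on the base class. Whether the actual writer should work this way is exactly the open question raised above.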