hadoop-yarn-issues mailing list archives

From "Naganarasimha G R (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS
Date Thu, 26 Mar 2015 06:32:54 GMT

    [ https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381453#comment-14381453 ]

Naganarasimha G R commented on YARN-3044:
-----------------------------------------

Thanks [~vinodkv], [~vrushalic], [~sjlee0] & [~zjshen] for reviewing and sharing your
viewpoints:
1> {{"source of life-cycle events of container"}} is a debatable topic. To summarize the pros
and cons of publishing them from the NM:

Pros 
* Even though the load is lower than that of publishing container metrics, life-cycle
events might still add considerable load on a large cluster, as explained by [~sjlee0]. So
I feel it is better to distribute this load across the NMs.
* If the start and end times of life-cycle events are logged from the NM, it will be easier
to analyze the flow of a container, since that is the actual time at which it was started.
* IMO it would be good to have all the metrics and events raised from the NM itself, as there
is a possible race condition if container entities are raised from the RM while metrics
and a few other life-cycle events come from the NM, e.g. when the RM is slow to dispatch its
events and the NM is faster. (HBase as storage will be able to handle this well, but I am not
sure about the other storages we are planning to support.)
  
Cons
* The start and end times of life-cycle events might not match what is displayed by the RM
(web UI etc.).
* In terms of scheduling, the start and end times of life-cycle events might not be as
accurate as they would be if published from the RM.
Please correct me on these, and add anything I have missed.
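
To make the race mentioned under Pros concrete, here is a minimal sketch, assuming a storage that merges writes by entity id (an upsert). {{UpsertSketch}}, its column names and the sample values are hypothetical and are not part of the patch or of any ATS API. With such a store, a metric from the NM that arrives before the creation event from the RM still ends up in the same row:

```java
import java.util.Map;
import java.util.TreeMap;

public class UpsertSketch {
    // Hypothetical store: entity id -> merged columns for that entity.
    private final Map<String, Map<String, Object>> rows = new TreeMap<>();

    // Create the row on first sight of the entity, whoever writes first.
    void upsert(String entityId, String column, Object value) {
        rows.computeIfAbsent(entityId, k -> new TreeMap<>()).put(column, value);
    }

    Map<String, Object> get(String entityId) {
        return rows.get(entityId);
    }

    public static void main(String[] args) {
        UpsertSketch store = new UpsertSketch();
        // The NM metric arrives before the RM's creation event (arbitrary values).
        store.upsert("container_1", "metric:cpu", 42);
        store.upsert("container_1", "event:CREATED", 1000L);
        System.out.println(store.get("container_1"));
    }
}
```

A storage that instead requires the entity to exist before accepting metrics or events would have to reject or buffer the first write, which is the case I am unsure about.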

2> ??But the life-cycle events of container should definitely originate at the RM; NMs
don't even know many of them.??
I am not very aware of this; can you please elaborate on which events might be missed?

3> ??Why would that be the case? Can the RM timeline collector not use specific subclasses
of TimelineEntity??
Well, it is not a limitation of the RM timeline collector that I am trying to point out;
rather, the writer interface looks like
{{TimelineWriter.write(TimelineEntities)}}
The writer would not be aware whether the client is writing an ApplicationEntity or an
AppAttemptEntity. IIUC it will just try to write the fields of the TimelineEntity to the
storage. If it just stores the entity as a JSON object directly in the storage it might not
be an issue, but that will not be the case with HBase column storage, right?
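
A minimal sketch of the concern, assuming a writer that only sees the base type. The classes below are simplified stand-ins written for illustration, not the real Hadoop classes, and the {{queue}} field and row format are hypothetical:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class WriterSketch {
    // Simplified stand-in for the base entity (not the real Hadoop class).
    static class TimelineEntity {
        final String type;
        final String id;
        final Map<String, Object> info = new LinkedHashMap<>();

        TimelineEntity(String type, String id) {
            this.type = type;
            this.id = id;
        }
    }

    // Hypothetical subclass carrying a field the base class does not know about.
    static class ApplicationEntity extends TimelineEntity {
        final String queue;

        ApplicationEntity(String id, String queue) {
            super("YARN_APPLICATION", id);
            this.queue = queue;
        }
    }

    // The writer is handed base-typed entities and persists only base fields.
    static List<String> write(List<TimelineEntity> entities) {
        List<String> rows = new ArrayList<>();
        for (TimelineEntity e : entities) {
            rows.add(e.type + "!" + e.id + " " + e.info);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<TimelineEntity> batch = new ArrayList<>();
        batch.add(new ApplicationEntity("application_1_0001", "default"));
        // The queue value never reaches the row, since write() sees only the base type.
        System.out.println(write(batch));
    }
}
```

In this sketch the subclass field survives only if the caller copies it into {{info}} before writing, or if the writer dispatches on {{type}}; which of these fits the real writer is what I would like to clarify.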

4> ??My suggestion is that we start with reimplementing what we provided in YTS v1, and
add more timeline data on demand later??
True, to start with this would be sufficient, but in future I would like to capture
all the events, since currently, to analyze/debug issues with a container, we usually start
by searching the NM and RM logs for the container string to find what state the
application/container is in. Your opinion?

> [Event producers] Implement RM writing app lifecycle events to ATS
> ------------------------------------------------------------------
>
>                 Key: YARN-3044
>                 URL: https://issues.apache.org/jira/browse/YARN-3044
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Naganarasimha G R
>         Attachments: YARN-3044.20150325-1.patch
>
>
> Per design in YARN-2928, implement RM writing app lifecycle events to ATS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
