hadoop-yarn-issues mailing list archives

From "Naganarasimha G R (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3045) [Event producers] Implement NM writing container lifecycle events to ATS
Date Thu, 06 Aug 2015 19:16:05 GMT

    [ https://issues.apache.org/jira/browse/YARN-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660638#comment-14660638 ]

Naganarasimha G R commented on YARN-3045:
-----------------------------------------

Hi All,
bq. I appreciate Naganarasimha G R's patience on this JIRA's work and I am sure the latest
patch (08) is getting much closer.
Thanks for the support; I will try to bring this to closure as early as possible. [~djp], one
more favor: could you join us on Slack? It would be faster to communicate there.
bq. Sorry but it's not clear what the 2 options are. Could you kindly rephrase the options?
Sorry for being cryptic here. What I meant was whether it is sufficient to capture localization
events once at the container level (localization succeeded or failed), or whether we need to
capture them for each {{LocalizedResource}} required by the container, which is more detailed
and helps to analyze whether any particular resource is taking a long time.
For the latter we need to use events on the LocalizedResource state machine (i.e. ResourceEventType.REQUEST,
LOCALIZED & LOCALIZATION_FAILED).
For the former we can use either {{ResourceLocalizationService}} events (i.e. LocalizationEventType.INIT_CONTAINER_RESOURCES
& CONTAINER_RESOURCES_LOCALIZED) or events on the {{ContainerImpl}} state machine (i.e.
ContainerEventType.RESOURCE_LOCALIZED & RESOURCE_FAILED). The advantage of the ResourceLocalizationService
events is that they give the precise start and end times of localization, whereas with
ContainerImpl (ContainerEventType) we only get an approximation, by calculating the difference
between the timestamps at which the INIT_CONTAINER & RESOURCE_LOCALIZED events are published.
Among these options, my preference is to use the ContainerImpl state machine events.
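
To make the ContainerImpl option concrete, here is a minimal sketch of the timestamp-difference approximation described above. It is not taken from the patch: {{SketchContainerEventType}} and {{LocalizationTimingSketch}} are hypothetical stand-ins (the real NM classes are not used), and the sketch assumes that the first RESOURCE_LOCALIZED or RESOURCE_FAILED seen for a container marks the end of its localization.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

/** Hypothetical stand-in for the subset of ContainerEventType discussed above. */
enum SketchContainerEventType { INIT_CONTAINER, RESOURCE_LOCALIZED, RESOURCE_FAILED, OTHER }

/** Hypothetical helper: approximates per-container localization time from event timestamps. */
class LocalizationTimingSketch {
  // Timestamp (ms) at which INIT_CONTAINER was seen, keyed by container id.
  private final Map<String, Long> initTimes = new HashMap<>();

  /**
   * Feeds one event. Returns the approximate localization duration in ms when
   * localization ends (successfully or not), and an empty value otherwise.
   */
  OptionalLong onEvent(String containerId, SketchContainerEventType type, long timestampMs) {
    switch (type) {
      case INIT_CONTAINER:
        initTimes.put(containerId, timestampMs);
        return OptionalLong.empty();
      case RESOURCE_LOCALIZED:
      case RESOURCE_FAILED:
        // Assumption: the first terminal event closes the measurement for this container.
        Long start = initTimes.remove(containerId);
        return start == null
            ? OptionalLong.empty()
            : OptionalLong.of(timestampMs - start);
      default:
        // Events that say nothing about localization are not timed.
        return OptionalLong.empty();
    }
  }
}
```

The result is only as precise as the moments at which the two events are published, which is the trade-off against the ResourceLocalizationService events.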

bq.  Naga means : 1. make resource localization events wrap as application entity; 2. make
resource localization events wrap as NM entity (in case we add it in future).
As explained in my previous comment, my intention was different: I wanted to place the localization
events along with the ContainerEntity. Is that fine?

bq. Also, even for some duplicated event, like APPLICATION_CONTAINER_FINISHED in ApplicationEvent,
ContainerEvent could provide more details: CONTAINER_EXITED_WITH_SUCCESS, CONTAINER_EXITED_WITH_FAILURE,
CONTAINER_KILLED_ON_REQUEST.
Even though these ContainerEvents give the detailed state of the container, its state machine
will still be in an intermediate state (not the DONE state) when these events are processed,
so the exit code, final state, diagnostic message etc. might not be filled in. So in the current
patch I have published the container-finished ATS event on the APPLICATION_CONTAINER_FINISHED
event rather than on the other container events. I wanted to check this further with you, [~djp],
but you raised the topic yourself. I feel it would be better to also capture CONTAINER_EXITED_WITH_FAILURE
and CONTAINER_KILLED_ON_REQUEST among the ContainerEvents. Thoughts?

bq. Second, for an unrecognized event, we should log a warn message (or at least a debug message)
instead of doing nothing.
It is basically not an unrecognized event but an event that is not of interest to us, so I think
it is better to just ignore it and delete the default case. OK?
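
As a rough illustration of what I mean (not the patch code), the sketch below handles only the event of interest and has no default case, so the other events pass through without any log message. {{SketchAppEventType}}, {{TimelinePublisherSketch}} and the "CONTAINER_FINISHED" label are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a few application-level NM event types. */
enum SketchAppEventType {
  INIT_APPLICATION, APPLICATION_CONTAINER_FINISHED, APPLICATION_LOG_HANDLING_FINISHED
}

/** Hypothetical publisher that records what would be sent to ATS. */
class TimelinePublisherSketch {
  // Stands in for the timeline writer; holds one label per published event.
  final List<String> published = new ArrayList<>();

  void handle(SketchAppEventType type, String containerId) {
    switch (type) {
      case APPLICATION_CONTAINER_FINISHED:
        // Published here because the container details are expected to be
        // filled in by the time this event arrives.
        published.add("CONTAINER_FINISHED:" + containerId);
        break;
      // No default case: the remaining events are known but not of interest,
      // so they are skipped silently rather than logged as unrecognized.
    }
  }
}
```

If we later decide to capture more events, each one gets its own case and the rest stay ignored.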

I will address the other comments and the issues reported by Jenkins as part of the next patch.

> [Event producers] Implement NM writing container lifecycle events to ATS
> ------------------------------------------------------------------------
>
>                 Key: YARN-3045
>                 URL: https://issues.apache.org/jira/browse/YARN-3045
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Naganarasimha G R
>         Attachments: YARN-3045-YARN-2928.002.patch, YARN-3045-YARN-2928.003.patch, YARN-3045-YARN-2928.004.patch, YARN-3045-YARN-2928.005.patch, YARN-3045-YARN-2928.006.patch, YARN-3045-YARN-2928.007.patch, YARN-3045-YARN-2928.008.patch, YARN-3045.20150420-1.patch
>
>
> Per design in YARN-2928, implement NM writing container lifecycle events and container system metrics to ATS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
