hadoop-yarn-issues mailing list archives

From "Sangjin Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3045) [Event producers] Implement NM writing container lifecycle events to ATS
Date Tue, 04 Aug 2015 18:46:05 GMT

    [ https://issues.apache.org/jira/browse/YARN-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654132#comment-14654132 ]

Sangjin Lee commented on YARN-3045:
-----------------------------------

Thanks for your input [~djp]! Just wanted to clarify a few things.

{quote}
Sorry. I wasn't at that meeting. What's the concern to have NodeManagerEntity? Without this,
how could we store something like NM's configuration?
{quote}

Naga is referring to the Wednesday status call. What I said there was that we do not need a
separate entity to handle *application*-related events coming out of node managers. If these
events are attributes of applications, then they belong on the application entities. If I want
to find all events for some application, I should be able to query only the application
entity and get all of them.
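
To make that concrete, here is a minimal sketch of the idea. All names here (SketchEvent, SketchAppEventStore, the event type strings) are hypothetical and made up for illustration; this is not the actual timeline entity API. It only shows NM-originated events being attached to the application they belong to, so a single lookup by application returns everything.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical event record; field names and values are illustrative only.
final class SketchEvent {
    final String source; // which daemon emitted the event, e.g. "RM" or "NM"
    final String type;   // an illustrative event type string

    SketchEvent(String source, String type) {
        this.source = source;
        this.type = type;
    }

    @Override
    public String toString() {
        return source + ":" + type;
    }
}

// Hypothetical in-memory store keyed by application id (an assumption, not the real schema).
final class SketchAppEventStore {
    private final Map<String, List<SketchEvent>> eventsByApp = new HashMap<>();

    // Attach an event to its application, regardless of which daemon produced it.
    void addEvent(String appId, SketchEvent event) {
        eventsByApp.computeIfAbsent(appId, k -> new ArrayList<>()).add(event);
    }

    // One query on the application returns every event, in insertion order.
    List<SketchEvent> eventsFor(String appId) {
        return new ArrayList<>(eventsByApp.getOrDefault(appId, new ArrayList<>()));
    }
}

final class AppEventsDemo {
    public static void main(String[] args) {
        SketchAppEventStore store = new SketchAppEventStore();
        store.addEvent("app_1", new SketchEvent("RM", "APP_SUBMITTED"));
        store.addEvent("app_1", new SketchEvent("NM", "CONTAINER_STARTED"));
        System.out.println(store.eventsFor("app_1"));
    }
}
```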

The need to have NodeManagerEntity is something different IMO. Note that today there are challenges
in emitting data without any application context (e.g. node manager's configuration), as we
have discussed a few times. If we need to support that, it deserves a separate discussion.

{quote}
This is true today. However, it may not be precisely for all cases/scenarios. Some implementation
of TimelineWriter, like: FS, may only have sync semantics for write(), and flush() could do
nothing.
{quote}

That's correct. What I meant was that, in general, the *contract* of write() may not guarantee
that the data is written completely synchronously. For FS, yes, write() will sync. Hence the
operative word "may". :)
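
A rough sketch of that contract, for illustration only. SketchWriter and both implementations are hypothetical names, not the actual TimelineWriter interface; the durable() method exists only so the sketch can be inspected. One implementation buffers on write() and persists on flush(); the other writes through on write(), so its flush() does nothing. Both satisfy the same contract.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a writer contract; NOT the actual TimelineWriter API.
interface SketchWriter {
    // May return before the data is durable.
    void write(String entity);

    // Once this returns, everything written so far is durable.
    void flush();

    // Inspection hook for this sketch only: a copy of what is durable right now.
    List<String> durable();
}

// Buffers on write(); data becomes durable only on flush().
final class BufferingWriter implements SketchWriter {
    private final List<String> buffer = new ArrayList<>();
    private final List<String> store = new ArrayList<>();

    public void write(String entity) {
        buffer.add(entity);
    }

    public void flush() {
        store.addAll(buffer);
        buffer.clear();
    }

    public List<String> durable() {
        return new ArrayList<>(store);
    }
}

// Writes through on write(), so flush() has nothing left to do.
final class WriteThroughWriter implements SketchWriter {
    private final List<String> store = new ArrayList<>();

    public void write(String entity) {
        store.add(entity);
    }

    public void flush() {
        // Intentionally empty: write() already made the data durable.
    }

    public List<String> durable() {
        return new ArrayList<>(store);
    }
}

final class WriterContractDemo {
    public static void main(String[] args) {
        SketchWriter buffering = new BufferingWriter();
        buffering.write("e1");
        System.out.println("durable before flush: " + buffering.durable());
        buffering.flush();
        System.out.println("durable after flush: " + buffering.durable());
    }
}
```

A caller that codes against the contract alone has to call flush() whenever it needs the guarantee, even though some writers would have been durable without it.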

{quote}
Do we need to differentiate synchronous from critical in the put operation from the TimelineClient
perspective? Sync most likely means the client logic relies on the return result of the put call,
and async put just means we call put in a non-blocking way. Critical and non-critical for messages (entities)
is a relative concept and could vary under different system configurations. Thus, I
won't be surprised if we put some critical entities in an async way, as only in very rare cases do we
need sync put in the client.
Actually, I was convinced in YARN-3949 (https://issues.apache.org/jira/browse/YARN-3949?focusedCommentId=14640910&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14640910)
that collector level knows better than writer to decide if it should flush. I would also like
to claim that collector could also know better than client on the boundary between critical
and non-critical due to the knowledge on system configuration, e.g. less types of entities
should be counted as critical for a large scale cluster but client has no knowledge about
it. If collector has no add-on knowledge against client, it could be simpler to pass down
sync/async() from client to sync/async in writer. Isn't it?
{quote}

Hmm, my assumption was that the sync/async distinction from the client perspective mapped
to whether the writer may be flushed or not. If not, then we need to support a 2x2 matrix
of possibilities: sync put w/ flush, sync put w/o flush, async put w/ flush, and async put
w/o flush. I thought it would be a simplifying assumption to align those dimensions.

My main point in YARN-3949 is that it is sufficient for the writer to provide write() and
flush(). The timeline collector can then support all possible semantics, even including the
2x2 matrix behavior if needed.
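
As a sketch of how that could look (hypothetical names throughout; EntityWriter, RecordingWriter and SketchCollector are not the real collector or writer classes, and the put() signature is an assumption for illustration): the writer exposes only write() and flush(), and the collector decides both whether the caller blocks and whether a flush follows the write. That covers all four combinations.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical minimal writer: only write() and flush().
interface EntityWriter {
    void write(String entity);

    void flush();
}

// Records the calls it receives so the behavior can be inspected.
final class RecordingWriter implements EntityWriter {
    private final List<String> calls = Collections.synchronizedList(new ArrayList<>());

    public void write(String entity) {
        calls.add("write:" + entity);
    }

    public void flush() {
        calls.add("flush");
    }

    List<String> calls() {
        synchronized (calls) {
            return new ArrayList<>(calls);
        }
    }
}

// Hypothetical collector: sync/async and flush/no-flush are independent choices here.
final class SketchCollector implements AutoCloseable {
    private final EntityWriter writer;
    // A single worker thread keeps writes in submission order.
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    SketchCollector(EntityWriter writer) {
        this.writer = writer;
    }

    // sync = block the caller until the write (and optional flush) has run.
    // flush = call writer.flush() right after the write.
    CompletableFuture<Void> put(String entity, boolean sync, boolean flush) {
        CompletableFuture<Void> done = CompletableFuture.runAsync(() -> {
            writer.write(entity);
            if (flush) {
                writer.flush();
            }
        }, executor);
        if (sync) {
            done.join();
        }
        return done;
    }

    @Override
    public void close() {
        executor.shutdown();
    }
}

final class CollectorDemo {
    public static void main(String[] args) {
        RecordingWriter writer = new RecordingWriter();
        try (SketchCollector collector = new SketchCollector(writer)) {
            collector.put("a", true, true);          // sync put w/ flush
            collector.put("b", true, false);         // sync put w/o flush
            collector.put("c", false, true).join();  // async put w/ flush
            collector.put("d", false, false).join(); // async put w/o flush
        }
        System.out.println(writer.calls());
    }
}
```

If we go with the simplifying assumption instead, the flush flag just becomes equal to the sync flag, and the writer does not change either way.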

> [Event producers] Implement NM writing container lifecycle events to ATS
> ------------------------------------------------------------------------
>
>                 Key: YARN-3045
>                 URL: https://issues.apache.org/jira/browse/YARN-3045
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Naganarasimha G R
>         Attachments: YARN-3045-YARN-2928.002.patch, YARN-3045-YARN-2928.003.patch, YARN-3045-YARN-2928.004.patch, YARN-3045-YARN-2928.005.patch, YARN-3045-YARN-2928.006.patch, YARN-3045.20150420-1.patch
>
>
> Per design in YARN-2928, implement NM writing container lifecycle events and container system metrics to ATS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
