hadoop-yarn-issues mailing list archives

From "Sangjin Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3981) support timeline clients not associated with an application
Date Mon, 27 Jul 2015 16:47:05 GMT

    [ https://issues.apache.org/jira/browse/YARN-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643006#comment-14643006 ]

Sangjin Lee commented on YARN-3981:

Some of us had an offline discussion on this. There are some major challenges in supporting
this in the v.2 design. First, these clients may lack an application-specific context, since
they can span multiple YARN apps. Second, even if we solved the context problem, these
clients are likely off-cluster, and they need a way to write to the cluster. Ideas such as
a separate dedicated timeline writer just for these clients have been discussed, but their
scalability is problematic at best.

One idea that was suggested involves creating a specialized YARN application that can act
as a proxy for these off-cluster clients. For example, suppose you started a Tez client that
can start multiple YARN apps. It can also start a special dedicated "(flow-level) timeline
client". This client would launch a special YARN app under the covers, whose app master and
associated timeline writer can serve as the proxy for timeline data the client may write.
When this special timeline client shuts down, it would also tear down the associated YARN app.
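
To make the lifecycle concrete, here is a rough sketch of a flow-level timeline client that owns
its proxy app for exactly as long as the client is open. Everything in it is hypothetical:
FakeCluster, ProxyApp, and FlowTimelineClient are illustrative stand-ins, not YARN or timeline
service APIs, and a real client would submit and kill an actual YARN application instead.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProxyApp:
    """Stand-in for the special YARN app (hypothetical, not a YARN class)."""
    app_id: str
    running: bool = True


class FakeCluster:
    """Hypothetical in-memory cluster that hands out app ids."""

    def __init__(self) -> None:
        self.apps: List[ProxyApp] = []

    def submit(self) -> ProxyApp:
        app = ProxyApp(app_id=f"proxy_{len(self.apps) + 1}")
        self.apps.append(app)
        return app

    def kill(self, app: ProxyApp) -> None:
        app.running = False


class FlowTimelineClient:
    """Launches the proxy app on open and tears it down on close."""

    def __init__(self, cluster: FakeCluster) -> None:
        self._cluster = cluster
        self.proxy: Optional[ProxyApp] = None

    def __enter__(self) -> "FlowTimelineClient":
        # Launch the proxy app "under the covers" when the client starts.
        self.proxy = self._cluster.submit()
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Tear the proxy app down when the client shuts down, even on error.
        if self.proxy is not None:
            self._cluster.kill(self.proxy)
        return False
```

Tying the teardown to the client's close path (here a context manager) is one way to avoid
leaking the proxy app if the higher-level client fails partway through.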

If we go this route, we would write the YARN app itself so that the app master listens for
requests coming from the client and proxies them to the timeline writer. We would also write
the timeline client piece so that it manages the YARN app and sends the write requests
to the app master.
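
The app master side could look roughly like the sketch below. It is a sketch under assumptions,
not a design decision: FlowContext, InMemoryTimelineWriter, and ProxyAppMaster are made-up names,
the network listener is reduced to a plain method call, and attributing context-less writes to
the proxy app's own id is just one possible way to satisfy the v.2 context requirement.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class FlowContext:
    """cluster + user + flow + flow run + application (hypothetical type)."""
    cluster: str
    user: str
    flow: str
    flow_run: str
    application: Optional[str] = None  # absent for flow-level data


class InMemoryTimelineWriter:
    """Stand-in for the timeline writer associated with the app master."""

    def __init__(self) -> None:
        self.written: List[Tuple[FlowContext, Dict]] = []

    def write(self, context: FlowContext, entity: Dict) -> None:
        self.written.append((context, entity))


class ProxyAppMaster:
    """Receives write requests from the client and forwards them."""

    def __init__(self, app_id: str, writer: InMemoryTimelineWriter) -> None:
        self.app_id = app_id
        self.writer = writer

    def handle_write(self, context: FlowContext, entity: Dict) -> FlowContext:
        # Assumption: a write with no application is attributed to the proxy app.
        if context.application is None:
            context = replace(context, application=self.app_id)
        self.writer.write(context, entity)
        return context
```

For example, a flow-level event sent with `FlowContext("c1", "alice", "etl", "run_7")` would be
written with `application` set to the proxy's app id.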

> support timeline clients not associated with an application
> -----------------------------------------------------------
>                 Key: YARN-3981
>                 URL: https://issues.apache.org/jira/browse/YARN-3981
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
> In the current v.2 design, all timeline writes must belong in a flow/application context
> (cluster + user + flow + flow run + application).
> But there are use cases that require writing data outside the context of an application.
> One such example is a higher level client (e.g. tez client or hive/oozie/cascading client)
> writing flow-level data that spans multiple applications. We need to find a way to support
