hadoop-yarn-issues mailing list archives

From "Sangjin Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1
Date Wed, 14 Jan 2015 01:45:37 GMT

    [ https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14276358#comment-14276358 ]

Sangjin Lee commented on YARN-2928:

Regarding the per-node approach, I do have some questions (and observations) beyond the
loss of isolation/attribution that was already discussed.

While it may be faster to allocate with the per-node companions, capacity-wise you would end
up spending more with the per-node approach, since these per-node companions are always up
even though they may be idle for large amounts of time. So if capacity is a concern, you may
lose out. Under what circumstances would per-node companions be more advantageous in terms
of capacity?
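
To make the capacity question concrete, here is a rough back-of-the-envelope sketch. It is not taken from the design doc: the function names, the MB-hours cost unit, and the assumption that a per-node companion and a per-app companion reserve the same amount of memory are all hypothetical simplifications.

```python
# Toy capacity model (all names and numbers are illustrative assumptions).
# Cost is measured in MB-hours of memory reserved for timeline companions.

def per_node_cost(num_nodes, companion_mb, window_hours):
    """Always-on companions: one per node for the whole window, busy or idle."""
    return num_nodes * companion_mb * window_hours


def per_app_cost(app_durations_hours, companion_mb):
    """On-demand companions: one per app, alive only while that app runs."""
    return sum(duration * companion_mb for duration in app_durations_hours)


def avg_concurrent_apps(app_durations_hours, window_hours):
    """Average number of apps running at once over the window."""
    return sum(app_durations_hours) / window_hours


if __name__ == "__main__":
    nodes, mb, window = 100, 512, 24
    durations = [2] * 300  # hypothetical workload: 300 apps, 2 hours each
    print("per-node MB-hours:", per_node_cost(nodes, mb, window))
    print("per-app  MB-hours:", per_app_cost(durations, mb))
    print("avg concurrent apps:", avg_concurrent_apps(durations, window))
```

Under these equal-size assumptions, the per-node approach spends less only when the average number of concurrently running apps exceeds the number of nodes. Different companion sizes or allocation overheads would shift that break-even point.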

I do have a question about the work-preserving aspect of the per-node ATS companion. One implication
of making this a per-node thing (i.e. long-running) is that we need to handle work-preserving
restart. What if we need to restart the ATS companion? Since the other YARN daemons (RM and NM)
allow for work-preserving restarts, we cannot have the ATS companion break that. So that seems
to be a requirement?

We still need to handle the lifecycle management aspects of it. Previously we said that when
the RM allocates an AM, it would tell the NM so that the NM could spawn the special container.
With the per-node approach, the RM would *still* need to tell the NM so that the NM can talk to
the per-node ATS companion to initialize the data structure for the given app.
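
The lifecycle hand-off described above could look roughly like the following. This is only a sketch to illustrate the point: the class and method names (`PerNodeCompanion`, `NodeManagerStub`, `on_am_allocated`, and so on) are hypothetical and do not correspond to actual YARN APIs.

```python
# Hypothetical sketch of per-app lifecycle inside a long-running per-node companion.

class PerNodeCompanion:
    """Long-running per-node process holding one data structure per active app."""

    def __init__(self):
        self._events_by_app = {}

    def init_app(self, app_id):
        # Idempotent: a repeated notification must not wipe existing data.
        self._events_by_app.setdefault(app_id, [])

    def put(self, app_id, event):
        if app_id not in self._events_by_app:
            raise KeyError(f"app {app_id} was never initialized on this node")
        self._events_by_app[app_id].append(event)

    def finish_app(self, app_id):
        # Release the per-app state and hand back whatever was collected.
        return self._events_by_app.pop(app_id, [])

    def active_apps(self):
        return sorted(self._events_by_app)


class NodeManagerStub:
    """Stands in for the NM: relays RM notifications to the local companion."""

    def __init__(self, companion):
        self._companion = companion

    def on_am_allocated(self, app_id):
        # The RM told us an AM for app_id was allocated on this node.
        self._companion.init_app(app_id)

    def on_app_finished(self, app_id):
        return self._companion.finish_app(app_id)
```

Even in this toy form, the NM still needs the "AM allocated" and "app finished" signals from the RM, so the signaling path is the same as in the per-app design; only the action taken on the NM side changes.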

These are quick observations. While I do see value in the per-node approach, it's not totally
clear how much work it would save over the per-app approach given these observations. What
do you think?

> Application Timeline Server (ATS) next gen: phase 1
> ---------------------------------------------------
>                 Key: YARN-2928
>                 URL: https://issues.apache.org/jira/browse/YARN-2928
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>            Priority: Critical
>         Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf
> We have the application timeline server implemented in yarn per YARN-1530 and YARN-321.
> Although it is a great feature, we have recognized several critical issues and features that
> need to be addressed.
> This JIRA proposes the design and implementation changes to address those. This is phase
> 1 of this effort.

