hadoop-yarn-issues mailing list archives

From "Sangjin Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1
Date Tue, 10 Feb 2015 22:37:15 GMT

    [ https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315100#comment-14315100 ]

Sangjin Lee commented on YARN-2928:

> Not sure I understand clearly as to how the relationship is captured. Consider this case:
> There are 5 hive queries: q1 to q5. There are 3 Tez apps: a1 to a3. Now, q1 and q5 ran on
> a1, q2 ran on a2 and q3,q4 ran on a3. Given q1, I need to know which app it ran on. Given
> a1, I need to know which queries ran on it. Could you clarify how this should be represented
> as flows?

Based on that description, this would be the parent-child relationship: a1 --> (q1, q5),
a2 --> (q2), a3 --> (q3, q4). Given q1, its parent is a1. Given a1, a1's children are
q1 and q5. If q1 spawned 3 YARN apps (y1, y2, y3), their parent would be q1. This parent-child
relationship would be encoded in the data model.

The only case where this would break is if the same entity needs more than one parent at the
YARN level (flow runs, YARN apps, etc.). Note that we're talking about flow *runs*, not flows:
the same flow may have multiple actual runs, and the parent-child relationship is defined at
the level of flow runs. Let me know if this helps.
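
To make the example concrete, here is a minimal sketch of such a single-parent hierarchy. It is
only an illustration of the relationship described above, not the actual data model in the
design documents; the class name FlowRunGraph and its methods are hypothetical.

```python
from collections import defaultdict


class FlowRunGraph:
    """Hypothetical single-parent hierarchy; not the actual ATS data model."""

    def __init__(self):
        self._parent = {}                   # child id -> parent id
        self._children = defaultdict(list)  # parent id -> child ids, in insertion order

    def add_child(self, parent, child):
        # Each entity may have at most one parent. A request for a second,
        # different parent is the case where this scheme breaks.
        existing = self._parent.get(child)
        if existing is not None and existing != parent:
            raise ValueError(f"{child} already has parent {existing}")
        if existing is None:
            self._parent[child] = parent
            self._children[parent].append(child)

    def parent_of(self, entity):
        # Returns None for an entity with no recorded parent.
        return self._parent.get(entity)

    def children_of(self, entity):
        # Returns a copy so callers cannot mutate the internal list.
        return list(self._children.get(entity, []))


graph = FlowRunGraph()

# a1 --> (q1, q5), a2 --> (q2), a3 --> (q3, q4)
for app, queries in {"a1": ["q1", "q5"], "a2": ["q2"], "a3": ["q3", "q4"]}.items():
    for query in queries:
        graph.add_child(app, query)

# q1 spawned three YARN apps, so q1 is their parent.
for yarn_app in ("y1", "y2", "y3"):
    graph.add_child("q1", yarn_app)

print(graph.parent_of("q1"))    # a1
print(graph.children_of("a1"))  # ['q1', 'q5']
```

Both lookups are answered from the same stored relationship. Calling graph.add_child("a2", "q1")
at this point would raise ValueError, which corresponds to the multiple-parent case.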

> Please explain what "globally" means.

What I'm envisioning is a boolean configuration that can disable the timeline service altogether,
not unlike the current switch on the ATS. If the timeline service is disabled through this
configuration, no timeline data would be written, no daemon would be started, etc.
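
As a rough sketch of what such a global switch implies, assuming a plain key/value configuration:
the key name below, the default of "off", and the TimelineService class are all hypothetical and
are not taken from the proposal.

```python
# Hypothetical key name, for illustration only.
TIMELINE_ENABLED_KEY = "timeline-service.enabled"


class TimelineService:
    """Toy stand-in for the service; it only models the on/off behavior."""

    def __init__(self, conf):
        # Assumption: the service is off unless explicitly enabled.
        self.enabled = bool(conf.get(TIMELINE_ENABLED_KEY, False))
        self.daemon_started = False
        self.written = []

    def start(self):
        # No daemon is started when the service is disabled.
        if self.enabled:
            self.daemon_started = True
        return self.daemon_started

    def write(self, entity):
        # No timeline data is written when the service is disabled.
        if not self.enabled:
            return False
        self.written.append(entity)
        return True


disabled = TimelineService({TIMELINE_ENABLED_KEY: False})
disabled.start()
disabled.write("flow-run-1")

enabled = TimelineService({TIMELINE_ENABLED_KEY: True})
enabled.start()
enabled.write("flow-run-1")
```

The point of the sketch is that a single boolean gates both the daemon and every write path, so
turning the switch off leaves no partial timeline activity behind.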

> Application Timeline Server (ATS) next gen: phase 1
> ---------------------------------------------------
>                 Key: YARN-2928
>                 URL: https://issues.apache.org/jira/browse/YARN-2928
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>            Priority: Critical
>         Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal v1.pdf
> We have the application timeline server implemented in YARN per YARN-1530 and YARN-321.
> Although it is a great feature, we have recognized several critical issues and features that
> need to be addressed.
> This JIRA proposes the design and implementation changes to address those. This is phase
> 1 of this effort.

This message was sent by Atlassian JIRA