hadoop-yarn-issues mailing list archives

From "Sangjin Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations
Date Tue, 23 Jun 2015 23:19:43 GMT

    [ https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598556#comment-14598556 ]

Sangjin Lee commented on YARN-3815:

AM currently leverage YARN's AppTimelineCollector to forward entities to backend storage,
so making AM talk directly to backend storage is not considered to be safe.

Just to be clear, I'm *not* proposing that AMs write directly to the backend storage. AMs continue
to write through the app-level timeline collector. My proposal is that the AMs are responsible
for setting the aggregated framework-specific metric values on the *YARN application entities*.

Let's consider the example of MR. MR itself would have its own entities such as job, tasks,
and task attempts. These are distinct entities from the YARN entities such as application,
app attempts, and containers. We can either (1) have the MR AM set framework-specific metric
values at the YARN container entities and have YARN aggregate them to applications, or (2)
have the MR AM set the aggregated values on the applications for itself.
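
To make the two options concrete, here is a minimal sketch in Python. The dict layout, the ids, the numbers, and both function names are illustrative assumptions and not the timeline service API; the counter name is only an example of a framework-specific metric.

```python
from collections import defaultdict

# Hypothetical per-container values of one framework-specific metric.
# Keys are (app_id, container_id); every id and number here is made up.
container_metrics = {
    ("app_1", "container_1"): {"MAP_OUTPUT_RECORDS": 100},
    ("app_1", "container_2"): {"MAP_OUTPUT_RECORDS": 250},
}


def option1_yarn_aggregates(container_metrics):
    """Option (1): the AM writes on container entities; YARN sums per application."""
    app_metrics = defaultdict(lambda: defaultdict(int))
    for (app_id, _container_id), metrics in container_metrics.items():
        for name, value in metrics.items():
            app_metrics[app_id][name] += value
    return {app_id: dict(m) for app_id, m in app_metrics.items()}


def option2_am_sets_app_metrics(app_id, framework_totals):
    """Option (2): the AM already holds its totals and sets them on the app entity."""
    return {app_id: dict(framework_totals)}


via_yarn = option1_yarn_aggregates(container_metrics)
via_am = option2_am_sets_app_metrics("app_1", {"MAP_OUTPUT_RECORDS": 350})
```

Both paths end with the same value on the application entity. The difference is who does the container-to-app step: YARN in option (1), the framework in option (2).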

I feel the latter approach is conceptually cleaner. The framework is ultimately responsible
for its metrics (YARN doesn't even know what metrics there are). We could decide that YARN
would look at the framework-specific metrics at the app level and aggregate them from the
app level onward to flows, user, and queue.
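
A rough sketch of what "aggregate from the app level onward" could look like, again in Python. The application records, the field names, and the `roll_up` helper are hypothetical, and summation is assumed as the aggregation function; nothing here reflects an actual schema.

```python
from collections import defaultdict

# Hypothetical application entities carrying app-level metrics set by the AM.
apps = [
    {"app_id": "app_1", "flow": "flow_a", "user": "alice", "queue": "default",
     "metrics": {"HDFS_BYTES_READ": 10}},
    {"app_id": "app_2", "flow": "flow_a", "user": "alice", "queue": "prod",
     "metrics": {"HDFS_BYTES_READ": 5}},
    {"app_id": "app_3", "flow": "flow_b", "user": "bob", "queue": "default",
     "metrics": {"HDFS_BYTES_READ": 7}},
]


def roll_up(apps, level):
    """Sum every app-level metric per value of `level` ("flow", "user" or "queue").

    The metric names are treated as opaque keys, so the roll-up does not
    need to know which metrics a framework defines.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for app in apps:
        for name, value in app["metrics"].items():
            totals[app[level]][name] += value
    return {key: dict(metrics) for key, metrics in totals.items()}


by_flow = roll_up(apps, "flow")
by_user = roll_up(apps, "user")
by_queue = roll_up(apps, "queue")
```

The point of the sketch is that the roll-up only reads what is on the application entities, whatever the framework chose to put there.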

In addition, most frameworks already have an aggregated view of the metrics. It would be very
straightforward to emit them at the app level.

In summary, option (1) asks the framework to write metrics on its own entities (job, tasks,
task attempts) plus YARN container entities. Option (2) asks the framework to write metrics
on its own entities (job, tasks, task attempts) plus YARN app entities. IMO, the latter is
a more reliable approach. We can discuss this further...

> [Aggregation] Application/Flow/User/Queue Level Aggregations
> ------------------------------------------------------------
>                 Key: YARN-3815
>                 URL: https://issues.apache.org/jira/browse/YARN-3815
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Junping Du
>            Assignee: Junping Du
>            Priority: Critical
>         Attachments: Timeline Service Nextgen Flow, User, Queue Level Aggregations (v1).pdf
> Per previous discussions in some design documents for YARN-2928, the basic scenario is
> the query for stats can happen on:
> - Application level, expect return: an application with aggregated stats
> - Flow level, expect return: aggregated stats for a flow_run, flow_version and flow
> - User level, expect return: aggregated stats for applications submitted by user
> - Queue level, expect return: aggregated stats for applications within the Queue
> Application states is the basic building block for all other level aggregations. We can
> provide Flow/User/Queue level aggregated statistics info based on application states (a dedicated
> table for application states is needed which is missing from previous design documents like
> HBase/Phoenix schema design).

This message was sent by Atlassian JIRA