hadoop-yarn-issues mailing list archives

From "Sangjin Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3816) [Aggregation] App-level aggregation and accumulation for YARN system metrics
Date Wed, 13 Apr 2016 01:57:25 GMT

    [ https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238407#comment-15238407 ]

Sangjin Lee commented on YARN-3816:

Thanks [~gtCarrera9] for the quick update!

As for the new metric type (i.e. base type + "_" + contributing child entity type), I do see
the rationale (or need) to distinguish aggregation coming from different entities. We should
still note that the metric would show somewhat awkwardly if we read the applications via queries.
Aggregated metrics would look like "MEMORY_YARN_CONTAINER" for example. I'm not quite sure
if there would be additional issues.
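
To make the naming concrete, here is a minimal sketch of how such an aggregated metric id could be formed. The class and method names ({{AggregatedMetricIds}}, {{of}}) are hypothetical and not taken from the patch; only the "base type + "_" + child entity type" rule and the "MEMORY_YARN_CONTAINER" example come from the discussion above.

```java
// Hypothetical helper for illustration only; not part of the YARN-3816 patch.
final class AggregatedMetricIds {
  private AggregatedMetricIds() {
  }

  /**
   * Builds the id of an aggregated metric by joining the base metric id and
   * the type of the child entity that contributed the values with "_".
   */
  static String of(String baseMetricId, String childEntityType) {
    return baseMetricId + "_" + childEntityType;
  }
}
```

With this rule, a base metric "MEMORY" aggregated from entities of type "YARN_CONTAINER" is what a query on the application would return as "MEMORY_YARN_CONTAINER".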

Also, I think we should be very judicious in permitting the aggregation. The most important
case should be YARN container-to-app. For per-framework metrics, AMs should handle
internal aggregations themselves and simply add the results to the application, as they usually
have the app-level metrics already anyway. That should be the main way to support them.

- l.244: “accumulated” -> “aggregated”?

- l.126: typo: “teal-time” -> “real-time”

- l.83, 87: since these methods expose internals of the {{TimelineCollector}} class, I would
make them {{protected}} to ensure only subclasses can use them
- l.171: I could suggest one more optimization in terms of memory footprint. If the given
entity does not have metrics, then we can/should skip the entire aggregation status step.
- l.230: It should be {{putIfAbsent()}}. Otherwise, {{put()}} would simply overwrite the value
even if the value exists, and it will result in an incorrect object being used.

- l.214: per comments on the JIRA, this new {{store()}} method should be removed, right?
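
To illustrate the l.171 and l.230 points together, here is a rough sketch under stated assumptions. {{AggregationStatusSketch}}, {{Status}}, and {{update()}} are hypothetical stand-ins and are not the names used in the patch; only {{ConcurrentMap.putIfAbsent()}} is a real JDK call. The sketch returns early for entities without metrics, and uses {{putIfAbsent()}} so that an existing status object is never replaced by a fresh one.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch for illustration only; not the code in the patch.
final class AggregationStatusSketch {

  // Hypothetical stand-in for the per-entity-type aggregation state.
  static final class Status {
    private long sum;
  }

  private final ConcurrentMap<String, Status> statusByType =
      new ConcurrentHashMap<>();

  /** Adds the metric values of one entity to the status of its type. */
  void update(String entityType, Map<String, Long> metrics) {
    if (metrics == null || metrics.isEmpty()) {
      // No metrics: skip the whole aggregation status step, so no status
      // object is allocated or stored for this entity.
      return;
    }
    Status fresh = new Status();
    // putIfAbsent() returns the existing value, or null if "fresh" was stored.
    // A plain put() here would overwrite a status that already holds values.
    Status existing = statusByType.putIfAbsent(entityType, fresh);
    Status status = (existing != null) ? existing : fresh;
    synchronized (status) {
      for (long value : metrics.values()) {
        status.sum += value;
      }
    }
  }

  /** Returns true if a status object exists for the given entity type. */
  boolean isTracked(String entityType) {
    return statusByType.containsKey(entityType);
  }

  /** Returns the accumulated sum for the entity type, or 0 if untracked. */
  long sum(String entityType) {
    Status status = statusByType.get(entityType);
    if (status == null) {
      return 0L;
    }
    synchronized (status) {
      return status.sum;
    }
  }
}
```

If {{put()}} were used instead, a second update for the same entity type would install a new, empty status and lose what the first one had accumulated.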

I would encourage others to take a closer look at this too. Thanks!

> [Aggregation] App-level aggregation and accumulation for YARN system metrics
> ----------------------------------------------------------------------------
>                 Key: YARN-3816
>                 URL: https://issues.apache.org/jira/browse/YARN-3816
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Junping Du
>            Assignee: Li Lu
>              Labels: yarn-2928-1st-milestone
>         Attachments: Application Level Aggregation of Timeline Data.pdf, YARN-3816-YARN-2928-v1.patch,
> YARN-3816-YARN-2928-v2.1.patch, YARN-3816-YARN-2928-v2.2.patch, YARN-3816-YARN-2928-v2.3.patch,
> YARN-3816-YARN-2928-v2.patch, YARN-3816-YARN-2928-v3.1.patch, YARN-3816-YARN-2928-v3.patch,
> YARN-3816-YARN-2928-v4.patch, YARN-3816-YARN-2928-v5.patch, YARN-3816-YARN-2928-v6.patch,
> YARN-3816-feature-YARN-2928.v4.1.patch, YARN-3816-poc-v1.patch, YARN-3816-poc-v2.patch
>
> We need application-level aggregation of Timeline data:
> - To present end users with aggregated states for each application, including: resource (CPU,
> Memory) consumption across all containers, number of containers launched/completed/failed,
> etc. We need this for apps while they are running as well as when they are done.
> - Also, framework-specific metrics, e.g. HDFS_BYTES_READ, should be aggregated to show
> details of states at the framework level.
> - Aggregation at other levels (Flow/User/Queue) can be more efficient if it is based on
> application-level aggregations rather than raw entity-level data, as far fewer rows need to be
> scanned (after filtering out non-aggregated entities such as events, configurations, etc.).

This message was sent by Atlassian JIRA
