hadoop-yarn-issues mailing list archives

From "Li Lu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3816) [Aggregation] App-level aggregation and accumulation for YARN system metrics
Date Tue, 18 Oct 2016 00:31:58 GMT

    [ https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15583958#comment-15583958 ]

Li Lu commented on YARN-3816:

Hi [~varun_saxena], please see my comments inline...
bq. We do not aggregate the entities reported since last aggregation run when app collector
finishes. Is this intentional? We however would miss only the last set of metrics which should
be fine.
That's not intentional... I remember this bug, and I have the impression that I once worked
on a fix, but it seems there is no JIRA to trace this work. I'll open a JIRA and trace the
bq. We also have aggregation interval fixed at 15 sec. Has it not been made configurable due
to concerns with somebody setting it too low or too high?
Having a system-wide configuration may not be enough, since app running times vary a lot. So
you're right that for now we're assuming the 15-second interval to avoid misconfigurations.
At the same time, we may want to explore different ways to allow applications to set their own
bq. Would it be better to use time weighted average for aggregated metrics.
I agree it would be helpful. However, I believe this is slightly different from the "aggregation"
we talk about here. As Sangjin mentioned before, "aggregation" in this JIRA mainly means applying
an aggregation method to all *subparts'* metrics to get the parent's metric, like aggregating
CPU usage for all containers to get the CPU usage of the whole app attempt.
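
To make the "aggregation" case concrete, here is a minimal Java sketch that sums the latest CPU usage of every container to get an app-level value. This is an illustration only: the class name, the method name, and the use of a plain sum are assumptions made for this example and are not taken from the timeline service code or the attached patches.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch; not the YARN timeline service API. */
final class AppLevelAggregationSketch {

    /**
     * Aggregates one metric across subparts: adds up the latest value
     * reported by each container to obtain the parent (app-level) value.
     */
    static double sumAcrossContainers(Map<String, Double> latestValueByContainer) {
        double total = 0.0;
        for (double value : latestValueByContainer.values()) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up container ids and CPU readings, for illustration only.
        Map<String, Double> cpu = new LinkedHashMap<>();
        cpu.put("container_1", 0.5);
        cpu.put("container_2", 1.25);
        cpu.put("container_3", 0.25);
        System.out.println(sumAcrossContainers(cpu)); // prints 2.0
    }
}
```

Note that every input here belongs to a different entity at the same point in time; nothing is combined across time.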

What you've mentioned here, IIUC, is something closer to the concept of "accumulation" as we discussed
before. Accumulation will apply an accumulative method to the same metric for the same timeline
entity *across time*. We have not yet started the work on accumulation, but my feeling is
we can make it work together with the aggregation framework without many changes to the code
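
For contrast, a time-weighted average is an example of "accumulation": it combines samples of one metric for one entity across time. The Java sketch below is only an illustration under stated assumptions. The class and method names are hypothetical, and it assumes each sample value holds until the next sample's timestamp, with an explicit end time closing the last interval. None of this reflects existing timeline service code, since the accumulation work has not started.

```java
/** Hypothetical sketch; not the YARN timeline service API. */
final class TimeWeightedAverageSketch {

    /**
     * Accumulates one metric of one entity across time.
     * Assumes timestampsMs is ascending and values[i] holds from
     * timestampsMs[i] until the next timestamp (or endTimeMs for the last one).
     */
    static double timeWeightedAverage(long[] timestampsMs, double[] values, long endTimeMs) {
        if (timestampsMs.length == 0 || timestampsMs.length != values.length) {
            throw new IllegalArgumentException("need matching, non-empty samples");
        }
        long span = endTimeMs - timestampsMs[0];
        if (span <= 0) {
            throw new IllegalArgumentException("end time must be after the first sample");
        }
        double weightedSum = 0.0;
        for (int i = 0; i < values.length; i++) {
            long next = (i + 1 < timestampsMs.length) ? timestampsMs[i + 1] : endTimeMs;
            if (next < timestampsMs[i]) {
                throw new IllegalArgumentException("timestamps must be ascending");
            }
            // Weight each value by how long it was in effect.
            weightedSum += values[i] * (next - timestampsMs[i]);
        }
        return weightedSum / span;
    }

    public static void main(String[] args) {
        // Made-up samples: 1.0 for 15 s, 4.0 for 15 s, then 1.0 for 30 s.
        long[] ts = {0L, 15_000L, 30_000L};
        double[] cpu = {1.0, 4.0, 1.0};
        System.out.println(timeWeightedAverage(ts, cpu, 60_000L)); // prints 1.75
    }
}
```

The unweighted mean of the same three samples would be 2.0; the weighting matters because the short spike to 4.0 covers only a quarter of the window.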

> [Aggregation] App-level aggregation and accumulation for YARN system metrics
> ----------------------------------------------------------------------------
>                 Key: YARN-3816
>                 URL: https://issues.apache.org/jira/browse/YARN-3816
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Junping Du
>            Assignee: Li Lu
>              Labels: yarn-2928-1st-milestone
>             Fix For: 3.0.0-alpha1
>         Attachments: Application Level Aggregation of Timeline Data.pdf, YARN-3816-YARN-2928-v1.patch,
YARN-3816-YARN-2928-v2.1.patch, YARN-3816-YARN-2928-v2.2.patch, YARN-3816-YARN-2928-v2.3.patch,
YARN-3816-YARN-2928-v2.patch, YARN-3816-YARN-2928-v3.1.patch, YARN-3816-YARN-2928-v3.patch,
YARN-3816-YARN-2928-v4.patch, YARN-3816-YARN-2928-v5.patch, YARN-3816-YARN-2928-v6.patch,
YARN-3816-YARN-2928-v7.patch, YARN-3816-YARN-2928-v8.patch, YARN-3816-YARN-2928-v9.patch,
YARN-3816-feature-YARN-2928.v4.1.patch, YARN-3816-poc-v1.patch, YARN-3816-poc-v2.patch
> We need application-level aggregation of Timeline data:
> - To present aggregated states to the end user for each application, including: resource (CPU,
Memory) consumption across all containers, number of containers launched/completed/failed,
etc. We need this for apps while they are running as well as when they are done.
> - Also, framework-specific metrics, e.g. HDFS_BYTES_READ, should be aggregated to show
details of states at the framework level.
> - Aggregation at other levels (Flow/User/Queue) can be more efficient if based on application-level
aggregations rather than raw entity-level data, since far fewer rows need to be scanned (after
filtering out non-aggregated entities such as events, configurations, etc.).

