hadoop-yarn-issues mailing list archives

From "Li Lu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3816) [Aggregation] App-level Aggregation for YARN system metrics
Date Mon, 27 Jul 2015 21:03:05 GMT

    [ https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643388#comment-14643388 ]

Li Lu commented on YARN-3816:

bq. I'm still very confused by the usage of the word "aggregate". In this patch, "aggregate"
really means accumulating values of a metric along the time dimension, which is completely
different than the notion of aggregation we have used all along. The aggregation has always
been about rolling up values from children to parents.

I have a similar concern about the dimensions of "aggregation". If I understand
the problem correctly, a flow/user level aggregation has two dimensions: one across
all entities that belong to the flow/user, and one across time. If we aggregate along
the flow/user dimension, one typical problem we will hit is aligning times. Suppose entities
E1 and E2 both belong to flow F1, and we would like to aggregate E1 and E2.
If a metric M is a time series, how do we align the times in E1.M and E2.M? The
two time series will normally have slightly different sample times, so I believe we need to
decide on the semantics for this situation.
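
To make the alignment question concrete, here is a minimal sketch of one possible semantic:
snap every sample to the start of a fixed-width time bucket, keep the latest sample per entity
per bucket, and sum across entities. This is an illustration only, not what the patch does;
the function name, the bucket width, and the sample values are all assumptions.

```python
from collections import defaultdict

def align_and_sum(series_list, bucket_ms):
    """Sum several (timestamp_ms, value) series after snapping each sample
    to the start of its fixed-width time bucket (hypothetical semantic)."""
    totals = defaultdict(float)
    for series in series_list:
        # Keep only the latest sample per bucket for this entity, so an
        # entity that reports twice in one bucket is not counted twice.
        latest = {}
        for ts, value in sorted(series):
            latest[ts - ts % bucket_ms] = value
        for bucket, value in latest.items():
            totals[bucket] += value
    return dict(sorted(totals.items()))

# Made-up samples for E1.M and E2.M with slightly different sample times.
e1_m = [(1000, 2.0), (2010, 3.0), (3005, 4.0)]
e2_m = [(1020, 1.0), (2030, 5.0), (3010, 1.5)]
f1_m = align_and_sum([e1_m, e2_m], bucket_ms=1000)
print(f1_m)
```

Even this simple rule leaves choices open: two samples just either side of a bucket boundary
land in different buckets, and a bucket in which only one entity reported yields a partial sum.
Those are the kinds of cases the semantics would need to cover.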

> [Aggregation] App-level Aggregation for YARN system metrics
> -----------------------------------------------------------
>                 Key: YARN-3816
>                 URL: https://issues.apache.org/jira/browse/YARN-3816
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Junping Du
>            Assignee: Junping Du
>         Attachments: Application Level Aggregation of Timeline Data.pdf, YARN-3816-poc-v1.patch,
> We need application level aggregation of Timeline data:
> - To present end users with aggregated states for each application, including: resource (CPU,
> Memory) consumption across all containers, number of containers launched/completed/failed,
> etc. We need this for apps while they are running as well as when they are done.
> - Also, framework specific metrics, e.g. HDFS_BYTES_READ, should be aggregated to show
> details of states at the framework level.
> - Other levels of aggregation (Flow/User/Queue) can be more efficient if based on Application-level
> aggregations rather than raw entity-level data, as far fewer rows need to be scanned (after
> filtering out non-aggregated entities such as events, configurations, etc.).
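
The child-to-parent roll-up described in the issue can be sketched as summing each metric over
all containers of one application. This is a rough illustration under stated assumptions, not
the timeline service API: the function name, the container IDs, the MEMORY_MB metric name, and
the values are hypothetical; only HDFS_BYTES_READ comes from the description above.

```python
def roll_up_to_app(container_metrics):
    """Sum each metric over all containers of one application."""
    app_totals = {}
    for metrics in container_metrics.values():
        for name, value in metrics.items():
            app_totals[name] = app_totals.get(name, 0) + value
    return app_totals

# Hypothetical per-container values for a single application.
containers = {
    "container_01": {"HDFS_BYTES_READ": 1024, "MEMORY_MB": 2048},
    "container_02": {"HDFS_BYTES_READ": 4096, "MEMORY_MB": 1024},
}
app_metrics = roll_up_to_app(containers)
print(app_metrics)
```

A flow, user, or queue level roll-up could then apply the same kind of sum to the per-application
totals instead of rescanning every container-level entity.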

This message was sent by Atlassian JIRA
