hadoop-yarn-issues mailing list archives

From "Li Lu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3816) [Aggregation] App-level aggregation and accumulation for YARN system metrics
Date Mon, 14 Dec 2015 22:13:46 GMT

    [ https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15056817#comment-15056817 ]

Li Lu commented on YARN-3816:
-----------------------------

Thanks for the explanations [~djp]! With regard to my question:

bq. There are 3 types of aggregation basis, but only application aggregation has its own entity
type. How do we represent the result entity of the other 2 types?

In TimelineAggregationBasis.java, we defined three types of aggregation basis: app, flow,
and user. If a timeline entity is generated in app-based aggregation, it will be assigned
entity type YARN_APPLICATION_AGGREGATION, right? So if I'm generating flow-level and
user-level aggregation data in offline aggregation, am I expected to add YARN_FLOW_AGGREGATION
and YARN_USER_AGGREGATION to TimelineEntityType? Just checking so that we're on the same page.
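
A minimal sketch of the mapping this question is about, to make it concrete. The enum and
class names below are hypothetical stand-ins, not the actual TimelineAggregationBasis or
TimelineEntityType classes, and the FLOW and USER entity types are only the ones proposed
above, not something the patch is known to define.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical stand-in for the three bases named in TimelineAggregationBasis.java.
enum AggregationBasis { APPLICATION, FLOW, USER }

// Hypothetical stand-in for the relevant TimelineEntityType values.
// Only YARN_APPLICATION_AGGREGATION is mentioned as existing; the other two are proposals.
enum AggregatedEntityType {
  YARN_APPLICATION_AGGREGATION,
  YARN_FLOW_AGGREGATION,
  YARN_USER_AGGREGATION
}

final class AggregationEntityTypes {
  // One result entity type per aggregation basis.
  private static final Map<AggregationBasis, AggregatedEntityType> BY_BASIS =
      new EnumMap<>(AggregationBasis.class);

  static {
    BY_BASIS.put(AggregationBasis.APPLICATION,
        AggregatedEntityType.YARN_APPLICATION_AGGREGATION);
    BY_BASIS.put(AggregationBasis.FLOW, AggregatedEntityType.YARN_FLOW_AGGREGATION);
    BY_BASIS.put(AggregationBasis.USER, AggregatedEntityType.YARN_USER_AGGREGATION);
  }

  private AggregationEntityTypes() {}

  // Returns the entity type an aggregator for the given basis would stamp on its results.
  static AggregatedEntityType forBasis(AggregationBasis basis) {
    return BY_BASIS.get(basis);
  }
}
```

With a lookup like this, an offline aggregator could call
AggregationEntityTypes.forBasis(AggregationBasis.FLOW) when building a flow-level result entity.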

On the aggregation logic side, I believe there will be a lot of future extensions on top of
this patch. For example, there may be new and interesting types of aggregations. In this JIRA,
maybe it's fine to restrict the aggregation types to REPLACE, SUM, and AREA, and then settle the
interface of the aggregation service? The offline aggregator (YARN-3817) will use this interface,
but I can always fine-tune the internal aggregation logic afterwards.
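
As a rough illustration of what such a restricted interface might look like, here is a sketch
with the three operations behind a single method. All names are hypothetical and none of this
is taken from the patch. In particular, AREA is assumed here to mean accumulating the metric
value multiplied by the elapsed time since the previous sample; this message does not define it.

```java
// Hypothetical operation set; mirrors the three names mentioned above.
enum AggregationOp { REPLACE, SUM, AREA }

final class MetricAggregator {
  private MetricAggregator() {}

  /**
   * Combines a running aggregate with a newly reported value.
   *
   * @param op            which aggregation to apply
   * @param current       the aggregate so far
   * @param incoming      the newly reported metric value
   * @param elapsedMillis time since the previous sample (used only by AREA)
   */
  static double apply(AggregationOp op, double current, double incoming,
      long elapsedMillis) {
    switch (op) {
      case REPLACE:
        // Keep only the latest value.
        return incoming;
      case SUM:
        // Add the new value to the running total.
        return current + incoming;
      case AREA:
        // Assumed semantics: accumulate value * time.
        return current + incoming * elapsedMillis;
      default:
        throw new IllegalArgumentException("Unknown operation: " + op);
    }
  }
}
```

Keeping the operations behind one method like this would let callers such as the offline
aggregator depend only on the interface while the internal logic is tuned later.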

> [Aggregation] App-level aggregation and accumulation for YARN system metrics
> ----------------------------------------------------------------------------
>
>                 Key: YARN-3816
>                 URL: https://issues.apache.org/jira/browse/YARN-3816
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Junping Du
>            Assignee: Junping Du
>              Labels: yarn-2928-1st-milestone
>         Attachments: Application Level Aggregation of Timeline Data.pdf, YARN-3816-YARN-2928-v1.patch,
YARN-3816-YARN-2928-v2.1.patch, YARN-3816-YARN-2928-v2.2.patch, YARN-3816-YARN-2928-v2.3.patch,
YARN-3816-YARN-2928-v2.patch, YARN-3816-YARN-2928-v3.1.patch, YARN-3816-YARN-2928-v3.patch,
YARN-3816-YARN-2928-v4.patch, YARN-3816-feature-YARN-2928.v4.1.patch, YARN-3816-poc-v1.patch,
YARN-3816-poc-v2.patch
>
>
> We need application-level aggregation of Timeline data:
> - To present end users with aggregated state for each application, including resource (CPU,
> memory) consumption across all containers, the number of containers launched/completed/failed,
> etc. We need this for apps while they are running as well as when they are done.
> - Also, framework-specific metrics, e.g. HDFS_BYTES_READ, should be aggregated to show
> state details at the framework level.
> - Aggregation at other levels (Flow/User/Queue) can be more efficient when based on
> application-level aggregations rather than raw entity-level data, since far fewer rows need
> to be scanned (non-aggregated entities such as events and configurations are filtered out).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
