hadoop-yarn-issues mailing list archives

From "Junping Du (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations
Date Thu, 02 Jul 2015 23:59:05 GMT

    [ https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612687#comment-14612687 ]

Junping Du commented on YARN-3815:

Thanks [~sjlee0] for the comments!
bq. I think it is pretty natural and straightforward for AMs to aggregate and retain values
at the app level, but even if they set it at the container level, it could work.
I would rather say it is "natural" only until timeline service v2 comes out. :) I don't think we
have to do it at the container level, but it is also not necessary for the AM to retain and aggregate
these values. The AM could help forward the values to the per-app timeline collector, but it doesn't
have to aggregate them. Vinod had more ideas on this in an offline discussion. [~vinodkv], can
you comment on this?

bq. Note that we're not proposing to keep the average as a time series. So I'm not sure if
that is feasible.
If not, we may consider changing the proposal to support time series, given that the amount of
data is not large here.
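
To make the difference concrete, here is a minimal sketch (not the proposed design) contrasting a single retained average with the same average kept as a time series. The class names `RunningAverage` and `AverageSeries` and the sample values are hypothetical, used only for illustration.

```python
class RunningAverage:
    """Keeps only the latest average; earlier values are not retained."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        # Incremental mean: no need to store the individual samples.
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


class AverageSeries:
    """Keeps the average as a time series of (timestamp, average) points."""

    def __init__(self):
        self._avg = RunningAverage()
        self.points = []

    def update(self, timestamp, value):
        # Each update stores one extra (timestamp, average) pair.
        self.points.append((timestamp, self._avg.update(value)))


# Hypothetical samples, e.g. a resource metric reported at three timestamps.
single = RunningAverage()
series = AverageSeries()
for ts, value in [(1, 10), (2, 20), (3, 30)]:
    single.update(value)
    series.update(ts, value)
```

The only extra cost of the time series variant is one stored point per update, which is the storage trade-off under discussion.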

bq. We also ruled out per-container averages (explained in the summary), so per-task resource
usage is not an example we're looking for.
I think "per-container averages" are not the same as per-container resource usage. Understanding
an application's real resource consumption/usage has been one of the core use cases for the new
timeline service from the beginning, so I don't think we should rule out anything important here.

> [Aggregation] Application/Flow/User/Queue Level Aggregations
> ------------------------------------------------------------
>                 Key: YARN-3815
>                 URL: https://issues.apache.org/jira/browse/YARN-3815
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Junping Du
>            Assignee: Junping Du
>            Priority: Critical
>         Attachments: Timeline Service Nextgen Flow, User, Queue Level Aggregations (v1).pdf,
aggregation-design-discussion.pdf, hbase-schema-proposal-for-aggregation.pdf
> Per previous discussions in some design documents for YARN-2928, the basic scenario is
that queries for stats can happen at the following levels:
> - Application level, expected return: an application with aggregated stats
> - Flow level, expected return: aggregated stats for a flow_run, flow_version and flow
> - User level, expected return: aggregated stats for applications submitted by the user
> - Queue level, expected return: aggregated stats for applications within the queue
> Application states are the basic building block for all other levels of aggregation. We can
provide Flow/User/Queue level aggregated statistics based on application states (a dedicated
table for application states is needed, which is missing from previous design documents such as
the HBase/Phoenix schema design).
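
The idea that application-level records are the building block for the other levels can be sketched as a simple roll-up. This is an illustration only, not the schema or API from the attached design documents: the `AppStats` record, its field names, and the metric names (`cpu_ms`, `mem_mb_s`) are all assumptions, and summation stands in for whatever aggregation operation is actually chosen.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AppStats:
    """Hypothetical application-level record with already-aggregated metrics."""
    app_id: str
    flow: str
    flow_run: str
    user: str
    queue: str
    metrics: dict  # metric name -> value aggregated at the application level


def roll_up(apps, key):
    """Sum application-level metrics into buckets selected by key(app)."""
    totals = defaultdict(lambda: defaultdict(int))
    for app in apps:
        bucket = totals[key(app)]
        for name, value in app.metrics.items():
            bucket[name] += value
    return {k: dict(v) for k, v in totals.items()}


# Hypothetical sample data.
apps = [
    AppStats("app_1", "flowA", "run1", "alice", "default", {"cpu_ms": 100, "mem_mb_s": 40}),
    AppStats("app_2", "flowA", "run1", "bob", "default", {"cpu_ms": 50}),
    AppStats("app_3", "flowB", "run7", "alice", "batch", {"cpu_ms": 25, "mem_mb_s": 10}),
]

by_flow_run = roll_up(apps, lambda a: (a.flow, a.flow_run))
by_user = roll_up(apps, lambda a: a.user)
by_queue = roll_up(apps, lambda a: a.queue)
```

Each level uses the same function with a different grouping key, which is why a dedicated table of application-level records is enough to derive the flow, user, and queue views in this sketch.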
