hadoop-yarn-issues mailing list archives

From "Vrushali C (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3816) [Aggregation] App-level Aggregation for YARN system metrics
Date Mon, 27 Jul 2015 21:45:05 GMT

    [ https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643463#comment-14643463 ]

Vrushali C commented on YARN-3816:
----------------------------------


bq. If I understand the problem correctly, we have two dimensions in a flow/user level aggregation:
one dimension for all entities belong to this flow/user, another dimension for time.

Ah, not quite. The time dimension goes with flow/user/queue. For example, we will aggregate
user-level stats over a time period such as a day or a week. Similarly for flows: in hRaven,
flows are aggregated over one day or one week. Ditto for users and queues. So let's say, for
simplicity, that user1 ran a wordcount MapReduce job three times on Monday and a sleep job
twice on Monday. The daily aggregation table for user1 will then hold, for each metric that
is a counter, the sum of that metric over all of user1's runs on that day, that is:

{code}
M1 for user1 on Monday = M1 from wordcount.run1 on Monday
                       + M1 from wordcount.run2 on Monday
                       + M1 from wordcount.run3 on Monday
                       + M1 from sleep.run1 on Monday
                       + M1 from sleep.run2 on Monday
{code}

Now, for flows on Monday:

{code}
M1 for wordcount on Monday = M1 from wordcount.run1 on Monday
                           + M1 from wordcount.run2 on Monday
                           + M1 from wordcount.run3 on Monday
M1 for sleep on Monday     = M1 from sleep.run1 on Monday
                           + M1 from sleep.run2 on Monday
{code}
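
The two roll-ups above can be sketched in Python as follows. This is only an illustration, not hRaven or timeline service code: the record layout, the `daily_sums` helper, and the counter values are all made up for the example.

```python
from collections import defaultdict

# Hypothetical per-run records: (user, flow, run, day, {metric: counter value}).
# The counter values are invented purely for illustration.
RUNS = [
    ("user1", "wordcount", "run1", "monday", {"M1": 10}),
    ("user1", "wordcount", "run2", "monday", {"M1": 20}),
    ("user1", "wordcount", "run3", "monday", {"M1": 30}),
    ("user1", "sleep", "run1", "monday", {"M1": 1}),
    ("user1", "sleep", "run2", "monday", {"M1": 2}),
]


def daily_sums(runs, key_index):
    """Sum every counter per (key, day).

    key_index 0 groups by user, key_index 1 groups by flow.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for record in runs:
        key, day, metrics = record[key_index], record[3], record[4]
        for name, value in metrics.items():
            totals[(key, day)][name] += value
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {k: dict(v) for k, v in totals.items()}


user_daily = daily_sums(RUNS, 0)  # user-level daily aggregation
flow_daily = daily_sums(RUNS, 1)  # flow-level daily aggregation
```

With these made-up values, the user-level entry for user1 on Monday sums all five runs, while the flow-level entries keep wordcount and sleep separate.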

For time series, we need to decide what aggregation means. One option is to normalize the
values to minute-level granularity, that is, add up the values that fall within each minute.
Anything that occurred within a minute is assigned to the top of that minute: e.g. something
that happened at 2 min 10 seconds is considered to have occurred at 2 min. That way we can
sum across flows/users/runs etc.
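
A minimal sketch of that minute-level normalization, assuming each time series is a list of (timestamp in seconds, value) pairs. The function names and sample data are hypothetical, and summing is just the option described above, not a settled design.

```python
from collections import defaultdict


def floor_to_minute(ts_seconds):
    """Assign a timestamp (in seconds) to the top of its minute."""
    return ts_seconds - (ts_seconds % 60)


def aggregate_timeseries(series_list):
    """Sum values from several series after bucketing them by minute."""
    totals = defaultdict(int)
    for series in series_list:
        for ts, value in series:
            totals[floor_to_minute(ts)] += value
    return dict(sorted(totals.items()))


# Invented samples: 130 s (2 min 10 s) and 150 s both land in the 120 s bucket.
run_a = [(130, 5), (150, 7)]
run_b = [(125, 1), (185, 4)]
combined = aggregate_timeseries([run_a, run_b])
```

Because every run is snapped to the same minute boundaries, series from different runs, flows, or users line up on common keys and can be added directly.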






> [Aggregation] App-level Aggregation for YARN system metrics
> -----------------------------------------------------------
>
>                 Key: YARN-3816
>                 URL: https://issues.apache.org/jira/browse/YARN-3816
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Junping Du
>            Assignee: Junping Du
>         Attachments: Application Level Aggregation of Timeline Data.pdf, YARN-3816-poc-v1.patch, YARN-3816-poc-v2.patch
>
>
> We need application-level aggregation of Timeline data:
> - To present aggregated states for each application to the end user, including: resource
> (CPU, Memory) consumption across all containers, number of containers launched/completed/failed,
> etc. We need this for apps while they are running as well as when they are done.
> - Also, framework-specific metrics, e.g. HDFS_BYTES_READ, should be aggregated to show
> details of states at the framework level.
> - Aggregation at other levels (Flow/User/Queue) can be more efficient when based on
> application-level aggregations rather than raw entity-level data, as far fewer rows need to
> be scanned (non-aggregated entities such as events, configurations, etc. are filtered out).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
