hadoop-yarn-issues mailing list archives

From "Sangjin Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3033) [Aggregator wireup] Implement NM starting the ATS writer companion
Date Fri, 20 Feb 2015 02:01:13 GMT

    [ https://issues.apache.org/jira/browse/YARN-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328453#comment-14328453
] 

Sangjin Lee commented on YARN-3033:
-----------------------------------

Thanks for the well-written proposal [~gtCarrera9]! It looks fine except for one thing IMO:
whether the RM's aggregator needs to use the app-level aggregators.

I'm not convinced that an "Application Level Aggregator inside RM" is needed or beneficial.
The main use case for the RM writing application-related data is writing application life-cycle
events. That is not much volume per app (at most a few events each). Furthermore,
it does not require any batching or aggregation of metrics. But with per-app
aggregators, the RM would retain a lot of memory for as long as the apps are alive, and that could
add up to significant memory pressure on a big or busy cluster. IMO, it would be a superfluous
abstraction with little benefit. Does the RM aggregator have to use it? Do you see it being
a useful abstraction? If so, how?

In my opinion, it would be far simpler and would also perform better if the RM aggregator wrote
data to the storage directly, outside the app-level context.
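
To make the suggestion concrete, here is a rough sketch of an RM-side writer that forwards each life-cycle event straight to storage and keeps no per-app state. All names here (TimelineStore, InMemoryTimelineStore, RmLifecycleWriter) are hypothetical and made up for illustration; none of them is an actual YARN or ATS class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical storage abstraction; not an actual YARN/ATS interface.
interface TimelineStore {
    void put(String appId, String eventType, long timestamp);
}

// In-memory stand-in for the real backing storage, for illustration only.
class InMemoryTimelineStore implements TimelineStore {
    final List<String> rows = new ArrayList<>();

    @Override
    public synchronized void put(String appId, String eventType, long timestamp) {
        rows.add(appId + ":" + eventType + "@" + timestamp);
    }
}

// Hypothetical RM-side writer. It holds only a reference to the store,
// so its memory footprint does not grow with the number of live apps.
class RmLifecycleWriter {
    private final TimelineStore store;

    RmLifecycleWriter(TimelineStore store) {
        this.store = store;
    }

    // Each life-cycle event is written through immediately; nothing is
    // batched or aggregated, and no per-app object is created.
    void onAppEvent(String appId, String eventType, long timestamp) {
        store.put(appId, eventType, timestamp);
    }
}
```

Since there are only a few events per app, a write-through path like this would not seem to need the batching that motivates the app-level aggregators elsewhere.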

{quote}
If N_app > N_node or N_app >> N_node, we may consider to launch a constant
number of aggregators inside each NodeManager, so the total aggregator entities is
bounded by the number of NMs. The reason we’d like to avoid running too many
aggregators is the pressure on the storage - too many writers writing to say HBase
RegionServers. We can override the aggregator mapping in this case.
{quote}
+1 to Junping's comment about keeping the model simple. This can also be handled in a different
manner if the concern is HBase. One can use a single shared HBase client for all app-level aggregators
within a per-node aggregator, which would mitigate that concern. If app-level aggregators are
separate processes, it's a different story, of course. Also, it's been my observation that
the nodes usually outnumber the active apps unless the apps are really tiny.
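
A minimal sketch of the shared-client idea, assuming the app-level aggregators live inside the same process as the per-node aggregator. The names below (StorageClient, CountingStorageClient, AppLevelAggregator, PerNodeAggregator) are hypothetical placeholders, and StorageClient merely stands in for whatever HBase client object would actually be shared; this is not the HBase or YARN API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the storage client (e.g. an HBase connection).
interface StorageClient {
    void write(String appId, String record);
}

// Test double that counts writes so the sharing can be observed.
class CountingStorageClient implements StorageClient {
    final AtomicInteger writes = new AtomicInteger();

    @Override
    public void write(String appId, String record) {
        writes.incrementAndGet();
    }
}

// Hypothetical app-level aggregator: it borrows the node's client
// instead of opening its own connection to storage.
class AppLevelAggregator {
    private final String appId;
    private final StorageClient client;

    AppLevelAggregator(String appId, StorageClient client) {
        this.appId = appId;
        this.client = client;
    }

    void post(String record) {
        client.write(appId, record);
    }

    StorageClient client() {
        return client;
    }
}

// Hypothetical per-node aggregator: owns the single client and hands it
// to every app-level aggregator it creates.
class PerNodeAggregator {
    private final StorageClient sharedClient;
    private final Map<String, AppLevelAggregator> apps = new ConcurrentHashMap<>();

    PerNodeAggregator(StorageClient sharedClient) {
        this.sharedClient = sharedClient;
    }

    AppLevelAggregator forApp(String appId) {
        return apps.computeIfAbsent(appId, id -> new AppLevelAggregator(id, sharedClient));
    }

    void removeApp(String appId) {
        apps.remove(appId);
    }

    int activeApps() {
        return apps.size();
    }
}
```

With this shape, the number of writers the storage sees is bounded by the number of nodes regardless of how many apps are running on each node.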

> [Aggregator wireup] Implement NM starting the ATS writer companion
> ------------------------------------------------------------------
>
>                 Key: YARN-3033
>                 URL: https://issues.apache.org/jira/browse/YARN-3033
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Li Lu
>         Attachments: MappingandlaunchingApplevelTimelineaggregators.pdf
>
>
> Per design in YARN-2928, implement node managers starting the ATS writer companion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
